Class: SimilarSearch

SimilarSearch()

Class for checking the similarity between strings, or for finding the most similar substring inside a string.

Constructor

new SimilarSearch()

Constructor of the class. Performs the basic initialization.

Methods

getBestSubstring(str1, str2, words1) → {Object}

Given two strings, searches for the best occurrence of the second inside the first, that is, the run of consecutive words of the first string that has the smallest Levenshtein distance to the second string.
Parameters:
Name Type Description
str1 String First string.
str2 String Second string.
words1 Array.<Object> Array of positions of the words of the first string. If not provided, it is built from the first string.
Returns:
Best occurrence, expressed as the index of the first character, the index of the last character, the Levenshtein distance, and the accuracy.
Type
Object
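The search described above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the helper names (`levenshtein`, `bestSubstring`), the result field names, the regular expression used to find words, and the accuracy formula `(str2.length - distance) / str2.length` are all assumptions made for illustration.

```javascript
// Hypothetical sketch: NOT the SimilarSearch source code.
// Single-row dynamic-programming Levenshtein distance.
function levenshtein(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i += 1) {
    let diag = row[0]; // distance for prefixes (i - 1, j - 1)
    row[0] = i;
    for (let j = 1; j <= b.length; j += 1) {
      const above = row[j];
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diag + cost);
      diag = above;
    }
  }
  return row[b.length];
}

// Tries every run of consecutive words in str1 and keeps the run with
// the smallest distance to str2 (the first one wins on ties).
function bestSubstring(str1, str2) {
  // Assumption: a "word" is a run of Unicode letters or digits.
  const words = [...str1.matchAll(/[\p{L}\p{N}]+/gu)];
  let best = null;
  for (let i = 0; i < words.length; i += 1) {
    for (let j = i; j < words.length; j += 1) {
      const start = words[i].index;
      const end = words[j].index + words[j][0].length - 1; // inclusive
      const distance = levenshtein(str1.slice(start, end + 1), str2);
      if (best === null || distance < best.levenshtein) {
        // Assumed accuracy formula; the library may define it differently.
        const accuracy = (str2.length - distance) / str2.length;
        best = { start, end, levenshtein: distance, accuracy };
      }
    }
  }
  return best;
}

const hit = bestSubstring('I want to fly to Barcelna tomorrow', 'Barcelona');
console.log(hit.start, hit.end, hit.levenshtein); // 17 24 1
```

This brute-force version compares every word run, so it is quadratic in the number of words; it is meant to show the idea, not to be efficient.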

getBestSubstringList(str1, str2, words1) → {Array.<Object>}

Given two strings, searches for all occurrences of the second inside the first whose accuracy is at least the threshold.
Parameters:
Name Type Description
str1 String First string.
str2 String Second string.
words1 Array.<Object> Array of positions of the words of the first string. If not provided, it is built from the first string.
Returns:
List of occurrences.
Type
Array.<Object>
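A threshold-based variant of the search might look like the sketch below. This page does not say where the threshold comes from, so the sketch takes it as an explicit, hypothetical `threshold` argument; the function names, result fields, and accuracy formula are likewise assumptions and not the library's API.

```javascript
// Hypothetical sketch: NOT the SimilarSearch source code.
// Single-row dynamic-programming Levenshtein distance.
function levenshtein(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i += 1) {
    let diag = row[0];
    row[0] = i;
    for (let j = 1; j <= b.length; j += 1) {
      const above = row[j];
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diag + cost);
      diag = above;
    }
  }
  return row[b.length];
}

// Collects every run of consecutive words whose accuracy reaches the
// threshold. `threshold` is an assumed parameter in the range 0..1.
function substringList(str1, str2, threshold) {
  const words = [...str1.matchAll(/[\p{L}\p{N}]+/gu)];
  const results = [];
  for (let i = 0; i < words.length; i += 1) {
    for (let j = i; j < words.length; j += 1) {
      const start = words[i].index;
      const end = words[j].index + words[j][0].length - 1; // inclusive
      const distance = levenshtein(str1.slice(start, end + 1), str2);
      const accuracy = (str2.length - distance) / str2.length; // assumed formula
      if (accuracy >= threshold) {
        results.push({ start, end, levenshtein: distance, accuracy });
      }
    }
  }
  return results;
}

const found = substringList('pizza or piza or pasta', 'pizza', 0.75);
console.log(found.map((r) => [r.start, r.end])); // [ [ 0, 4 ], [ 9, 12 ] ]
```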

getEdgesFromEntities(str, entities, locale, whitelist)

Given an utterance and an array of entities with options, searches for the best option for each entity and returns the results.
Parameters:
Name Type Description
str String Utterance from which to retrieve entities.
entities Array.<Object> Entities Array.
locale String Locale for the search.
whitelist Array.<String> Whitelist of entity names for the search.

getSimilarity(str1, str2) → {Number}

Calculates the Levenshtein distance between two strings.
Parameters:
Name Type Description
str1 String First string.
str2 String Second string.
Returns:
Levenshtein distance.
Type
Number
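For readers unfamiliar with the metric, the Levenshtein distance is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. The following is a standard dynamic-programming sketch of that definition; the function name `levenshteinDistance` is hypothetical, and the library's own implementation may differ.

```javascript
// Textbook Levenshtein distance, shown for illustration only.
// This is NOT the SimilarSearch source code.
function levenshteinDistance(a, b) {
  // prev[j] = distance between the first i - 1 chars of a and the first j chars of b.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i += 1) {
    const curr = [i]; // turning i chars into an empty string takes i deletions
    for (let j = 1; j <= b.length; j += 1) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

console.log(levenshteinDistance('kitten', 'sitting')); // 3
console.log(levenshteinDistance('flaw', 'lawn')); // 2
```

Note that a smaller result means the strings are more alike, so despite the method's name the returned number is a distance rather than a similarity score.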

getWordPositions(str) → {Array.<Object>}

Given a string, iterates over it and returns the start position, end position, and length of each word, without tokenizing the string.
Parameters:
Name Type Description
str String String to be processed.
Returns:
Array of positions of the words, with the start index, end index, and length.
Type
Array.<Object>
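A single pass over the characters is enough to collect word positions, as the sketch below shows. The field names `start`, `end`, and `len`, the inclusive end index, and the definition of a word character (Unicode letter or digit) are assumptions made for illustration; the actual objects returned by the library may be shaped differently.

```javascript
// Hypothetical sketch: NOT the SimilarSearch source code.
// Assumption: a word character is any Unicode letter or digit.
function isWordChar(c) {
  return /[\p{L}\p{N}]/u.test(c);
}

// Scans the string once and records where each word starts and ends,
// without splitting the string into tokens.
function wordPositions(str) {
  const positions = [];
  let start = -1; // -1 means "not currently inside a word"
  for (let i = 0; i < str.length; i += 1) {
    if (isWordChar(str[i])) {
      if (start === -1) start = i;
    } else if (start !== -1) {
      positions.push({ start, end: i - 1, len: i - start });
      start = -1;
    }
  }
  // Close a word that runs to the end of the string.
  if (start !== -1) {
    positions.push({ start, end: str.length - 1, len: str.length - start });
  }
  return positions;
}

console.log(wordPositions('hello, big world'));
// [ { start: 0, end: 4, len: 5 },
//   { start: 7, end: 9, len: 3 },
//   { start: 11, end: 15, len: 5 } ]
```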

isAlphanumeric(c) → {Boolean}

Indicates if a character is alphanumeric.
Parameters:
Name Type Description
c Character Character.
Returns:
True if the character is alphanumeric, false otherwise.
Type
Boolean
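One possible way to write such a check is with a Unicode-aware regular expression, as sketched here. Which characters the library actually treats as alphanumeric is not stated on this page, so the character classes below (any Unicode letter or digit) and the name `isAlphanumericSketch` are assumptions.

```javascript
// Hypothetical sketch: NOT the SimilarSearch source code.
// Treats a single character as alphanumeric if it is a Unicode letter
// (\p{L}) or a Unicode digit (\p{N}); underscores and spaces are excluded.
function isAlphanumericSketch(c) {
  return typeof c === 'string' && /^[\p{L}\p{N}]$/u.test(c);
}

console.log(isAlphanumericSketch('a')); // true
console.log(isAlphanumericSketch('7')); // true
console.log(isAlphanumericSketch('\u00e9')); // true (e with acute accent)
console.log(isAlphanumericSketch(' ')); // false
console.log(isAlphanumericSketch('-')); // false
```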