Class: StemmerJa

StemmerJa()

new StemmerJa()

Methods

isKatakana(str) → {boolean}

Checks whether a string consists only of fullwidth katakana. The author reports this as the fastest implementation known to them: http://jsperf.com/string-contain-katakana-only/2
Parameters:
Name Type Description
str string A string.
Returns:
True if the string contains only katakana.
Type
boolean
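
A minimal sketch of how such a check could look, assuming a single anchored regular expression over the Unicode Katakana block (U+30A0 to U+30FF). The helper name `isKatakanaOnly` is hypothetical and this is not necessarily the library's actual implementation:

```javascript
// Hypothetical standalone helper, not the library's source.
// Returns true only when every character lies in the Unicode
// Katakana block (U+30A0 to U+30FF). Halfwidth katakana
// (U+FF61 to U+FF9F) and the empty string do not match.
function isKatakanaOnly(str) {
  return /^[\u30A0-\u30FF]+$/.test(str);
}

console.log(isKatakanaOnly('コンピューター')); // true
console.log(isKatakanaOnly('コンピュータ1'));  // false (contains a digit)
console.log(isKatakanaOnly('ｶﾀｶﾅ'));          // false (halfwidth katakana)
```

Note that the block U+30A0 to U+30FF includes the prolonged sound mark (U+30FC), so words such as コーヒー pass the check.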

stem(token) → {string}

Stem a term.
Parameters:
Name Type Description
token string The term to stem.
Returns:
The stemmed term.
Type
string

stemKatakana(token) → {string}

Removes the final prolonged sound mark (ー) from a katakana token if the token's length exceeds a threshold.
Parameters:
Name Type Description
token string A katakana string to stem.
Returns:
The stemmed katakana string.
Type
string
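
The rule described above can be sketched as follows. The function name `stemKatakanaSketch`, the `threshold` parameter, and its default of 4 are assumptions made for illustration; this page does not state the threshold the library uses:

```javascript
const PROLONGED_SOUND_MARK = '\u30FC'; // 'ー'

// Hypothetical sketch, not the library's source.
// `threshold` is an illustrative parameter; the actual value used by
// StemmerJa is not given in this documentation.
function stemKatakanaSketch(token, threshold = 4) {
  // Drop the trailing mark only when the token is longer than the threshold.
  if (token.length > threshold && token.endsWith(PROLONGED_SOUND_MARK)) {
    return token.slice(0, -1);
  }
  return token;
}

console.log(stemKatakanaSketch('コンピューター')); // コンピュータ
console.log(stemKatakanaSketch('コピー'));         // コピー (too short, unchanged)
```

The length guard keeps short words intact, where the final mark is more likely to carry meaning than to be a spelling variant.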

tokenizeAndStem(text, keepStops) → {Array.<string>}

Tokenize and stem a text. Stop words are excluded unless the second argument is true.
Parameters:
Name Type Description
text string The text to tokenize and stem.
keepStops boolean Whether to keep stop words in the output.
Returns:
The stemmed tokens.
Type
Array.<string>
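
To illustrate the effect of `keepStops`, here is a self-contained sketch of a tokenize, filter, stem pipeline. Everything in it is a hypothetical stand-in: the whitespace tokenizer, the three-word stop list, the simplified stemmer, and the order of the filter and stem steps are assumptions, not the library's actual components:

```javascript
// Hypothetical stand-ins; the real tokenizer, stop list, and stemmer
// used by StemmerJa are not shown in this documentation.
const STOP_WORDS = new Set(['の', 'は', 'を']);
const tokenize = (text) => text.split(/\s+/).filter(Boolean);
const stem = (token) =>
  token.length > 4 && token.endsWith('ー') ? token.slice(0, -1) : token;

function tokenizeAndStemSketch(text, keepStops = false) {
  return tokenize(text)
    // Drop stop words unless the caller asked to keep them.
    .filter((token) => keepStops || !STOP_WORDS.has(token))
    .map(stem);
}

console.log(tokenizeAndStemSketch('コンピューター の プリンター'));
// [ 'コンピュータ', 'プリンタ' ]
console.log(tokenizeAndStemSketch('コンピューター の プリンター', true));
// [ 'コンピュータ', 'の', 'プリンタ' ]
```

Real Japanese text is not whitespace-delimited, so the pre-spaced input here only serves to keep the example short.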