new StemmerJa()
Methods
isKatakana(str) → {boolean}
Tests whether a string consists solely of fullwidth katakana.
This implementation is the fastest one I know of; see the benchmark:
http://jsperf.com/string-contain-katakana-only/2
Parameters:
Name | Type | Description |
---|---|---|
str | string | A string. |
Returns:
True if the string contains only fullwidth katakana.
- Type
- boolean
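As an illustration of the check described above, here is a minimal sketch. It assumes that "fullwidth katakana" means the Unicode Katakana block (U+30A0 to U+30FF) and that an empty string should return false. The name `isKatakanaSketch` is hypothetical, and this is not necessarily how the method itself is implemented.

```javascript
// Sketch only: assumes "fullwidth katakana" is the Unicode Katakana block
// U+30A0..U+30FF. Halfwidth katakana (U+FF66..U+FF9F) is deliberately excluded.
function isKatakanaSketch(str) {
  // The anchors and the + quantifier require every character to be in range
  // and reject the empty string.
  return /^[\u30A0-\u30FF]+$/.test(str);
}

console.log(isKatakanaSketch('カタカナ')); // fullwidth katakana only
console.log(isKatakanaSketch('ひらがな')); // hiragana
console.log(isKatakanaSketch('ｶﾀｶﾅ'));     // halfwidth katakana
```

The prolonged sound mark (ー, U+30FC) lies inside this block, so a word such as コンピューター still counts as katakana only under this assumption.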
stem(token) → {string}
Stem a term.
Parameters:
Name | Type | Description |
---|---|---|
token | string | The term to stem. |
Returns:
- Type
- string
stemKatakana(token) → {string}
Remove the final prolonged sound mark (ー) from a katakana token if its
length exceeds a threshold.
Parameters:
Name | Type | Description |
---|---|---|
token | string | A katakana string to stem. |
Returns:
The stemmed katakana string.
- Type
- string
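The rule described above could be sketched roughly as follows. The documentation does not state the threshold, so the default of 4 used here is an assumption, exposed as a hypothetical `minLength` parameter. `stemKatakanaSketch` is likewise an illustrative name, not part of the class.

```javascript
// Sketch only: the threshold value is an assumption (the docs do not state it).
const PROLONGED_SOUND_MARK = 'ー'; // U+30FC

function stemKatakanaSketch(token, minLength = 4) {
  // Leave short tokens, and tokens not ending in the mark, unchanged.
  if (token.length <= minLength || !token.endsWith(PROLONGED_SOUND_MARK)) {
    return token;
  }
  // Drop exactly one trailing prolonged sound mark.
  return token.slice(0, -1);
}

console.log(stemKatakanaSketch('コンピューター')); // 7 characters, ends in ー
console.log(stemKatakanaSketch('コピー'));         // 3 characters, left as-is
```

A length threshold keeps short words such as コピー intact, where removing the mark would be more likely to change the word than to normalize a spelling variant.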
tokenizeAndStem(text, keepStops) → {Array.<string>}
Tokenize and stem a text.
Stop words are excluded unless the second argument is true.
Parameters:
Name | Type | Description |
---|---|---|
text | string | The text to tokenize and stem. |
keepStops | boolean | Whether to keep stop words in the output. |
Returns:
- Type
- Array.<string>
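To make the effect of `keepStops` concrete, here is a rough sketch of a tokenize, filter, stem pipeline. Everything in it is a stand-in: the whitespace tokenizer, the three-word stop list, the simplified stemmer, and the order of filtering before stemming are all assumptions for illustration, not the behavior of the class. Real Japanese text is not space-delimited and needs a proper tokenizer.

```javascript
// Sketch only: hypothetical stop list, tokenizer, and stemmer.
const STOP_WORDS_SKETCH = new Set(['の', 'は', 'を']);

// Stand-in stemmer: drop one trailing prolonged sound mark from longer tokens.
function stemSketch(token) {
  return token.length > 4 && token.endsWith('ー') ? token.slice(0, -1) : token;
}

function tokenizeAndStemSketch(text, keepStops = false) {
  // Stand-in tokenizer: split on whitespace and discard empty pieces.
  const tokens = text.split(/\s+/).filter(Boolean);
  return tokens
    // Keep every token when keepStops is true; otherwise drop stop words.
    .filter((token) => keepStops || !STOP_WORDS_SKETCH.has(token))
    .map(stemSketch);
}

console.log(tokenizeAndStemSketch('コンピューター の 本'));
console.log(tokenizeAndStemSketch('コンピューター の 本', true));
```

With the default second argument the stop word の is dropped from the result; passing true keeps it in place between the other two tokens.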