public class HunspellTokenizer extends BaseTokenizer
Inherited nested classes/interfaces: ITokenizer.StemmingMode
Inherited fields: DEFAULT_TOKENS_COUNT, EMPTY_STRING_LIST, EMPTY_TOKENS_LIST, shouldDelegateTokenizeExactly, TOKENIZER_DEBUG_PROVIDER
Constructor and Description |
---|
HunspellTokenizer() |
Modifier and Type | Method and Description |
---|---|
java.lang.String[] | getSupportedLanguages() Return an array of language strings (xx-yy) indicating the tokenizer's supported languages. |
protected org.apache.lucene.analysis.TokenStream | getTokenStream(java.lang.String strOrig, boolean stemsAllowed, boolean stopWordsAllowed) |
static void | loadPlugins() |
static void | unloadPlugins() |
Inherited methods: getEffectiveLanguage, getProjectLanguage, getStandardTokenStream, printTest, test, tokenize, tokenizeByCodePoint, tokenizeByCodePointToStrings, tokenizeToStrings, tokenizeVerbatim, tokenizeVerbatimToStrings, tokenizeWords, tokenizeWordsToStrings
protected org.apache.lucene.analysis.TokenStream getTokenStream(java.lang.String strOrig, boolean stemsAllowed, boolean stopWordsAllowed) throws java.io.IOException
getTokenStream in class BaseTokenizer
Throws: java.io.IOException
public java.lang.String[] getSupportedLanguages()
Description copied from interface: ITokenizer
Return an array of language strings (xx-yy) indicating the tokenizer's supported languages. Meant for tokenizers whose supported languages can only be determined at runtime, such as the HunspellTokenizer.
To indicate that this method should be used, set the Tokenizer annotation to contain only Tokenizer.DISCOVER_AT_RUNTIME.
Specified by: getSupportedLanguages in interface ITokenizer
Overrides: getSupportedLanguages in class BaseTokenizer
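The runtime-discovery contract described for getSupportedLanguages() can be illustrated with a minimal, self-contained sketch. Everything in it is a stand-in written for illustration, not the library's actual code: the local Tokenizer annotation (including its languages element and the placeholder value of DISCOVER_AT_RUNTIME), the SupportedLanguagesSource interface, the RuntimeDiscoveredTokenizer class, and the LanguageResolver helper are all hypothetical. It only shows one plausible way a caller could decide between the languages declared in the annotation and the array returned at runtime.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical stand-in for the Tokenizer annotation; the element name and
// the constant's value are assumptions made for this sketch.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Tokenizer {
    String DISCOVER_AT_RUNTIME = "DISCOVER_AT_RUNTIME"; // placeholder value
    String[] languages();
}

// Hypothetical stand-in for the single ITokenizer method documented above.
interface SupportedLanguagesSource {
    String[] getSupportedLanguages();
}

// A tokenizer whose annotation contains only DISCOVER_AT_RUNTIME, so the
// language list has to come from getSupportedLanguages().
@Tokenizer(languages = { Tokenizer.DISCOVER_AT_RUNTIME })
class RuntimeDiscoveredTokenizer implements SupportedLanguagesSource {
    @Override
    public String[] getSupportedLanguages() {
        // Hard-coded here; a real tokenizer would inspect whatever
        // resources are available at runtime.
        return new String[] { "en-US", "fr-FR" };
    }
}

// Hypothetical caller-side logic: use the annotation's languages unless it
// contains only the DISCOVER_AT_RUNTIME marker.
class LanguageResolver {
    static String[] resolve(Object tokenizer) {
        Tokenizer ann = tokenizer.getClass().getAnnotation(Tokenizer.class);
        if (ann == null) {
            return new String[0];
        }
        String[] declared = ann.languages();
        boolean discover = declared.length == 1
                && Tokenizer.DISCOVER_AT_RUNTIME.equals(declared[0]);
        if (discover && tokenizer instanceof SupportedLanguagesSource) {
            return ((SupportedLanguagesSource) tokenizer).getSupportedLanguages();
        }
        return declared;
    }
}
```

In this sketch, calling LanguageResolver.resolve(new RuntimeDiscoveredTokenizer()) returns the array produced by getSupportedLanguages() rather than the marker stored in the annotation.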
public static void loadPlugins()
public static void unloadPlugins()