Package org.omegat.tokenizer
Interface Summary
Interface
Description
ITokenizer
Interface for string tokenization engines.
Class Summary
Class
Description
BaseTokenizer
Base class for Lucene-based tokenizers.
DefaultTokenizer
Methods for tokenizing strings.
HunspellTokenizer
Methods for tokenizing strings.
LuceneArabicTokenizer
LuceneArmenianTokenizer
LuceneBasqueTokenizer
LuceneBrazilianTokenizer
LuceneBulgarianTokenizer
LuceneCatalanTokenizer
LuceneCJKTokenizer
LuceneCzechTokenizer
LuceneDanishTokenizer
LuceneDutchTokenizer
LuceneEnglishTokenizer
LuceneFinnishTokenizer
LuceneFrenchTokenizer
LuceneGalicianTokenizer
LuceneGermanTokenizer
LuceneGreekTokenizer
LuceneHindiTokenizer
LuceneHungarianTokenizer
LuceneIndonesianTokenizer
LuceneIrishTokenizer
LuceneItalianTokenizer
LuceneJapaneseTokenizer
LuceneLatvianTokenizer
LuceneNorwegianTokenizer
LucenePersianTokenizer
LucenePolishTokenizer
LucenePortugueseTokenizer
LuceneRomanianTokenizer
LuceneRussianTokenizer
LuceneSmartChineseTokenizer
LuceneSpanishTokenizer
LuceneSwedishTokenizer
LuceneThaiTokenizer
LuceneTurkishTokenizer
WordIterator
A BreakIterator for word breaks that applies OmegaT heuristics on top of an underlying BreakIterator instance that implements word breaking.
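As a rough illustration of the kind of word-break iteration that WordIterator builds on, the sketch below uses only the JDK's java.text.BreakIterator. It does not use or reproduce OmegaT's WordIterator or its heuristics. The class name WordBreakSketch and the helper words(...) are hypothetical names introduced for this example, and the rule "keep a segment only if it contains a letter or digit" is an assumption made here to drop whitespace and punctuation.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch class; not part of org.omegat.tokenizer.
class WordBreakSketch {

    // Splits text at the word boundaries reported by the JDK BreakIterator
    // and keeps only segments that contain at least one letter or digit.
    static List<String> words(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);

        List<String> result = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String segment = text.substring(start, end);
            // Assumption for this sketch: whitespace and punctuation segments are skipped.
            if (segment.codePoints().anyMatch(Character::isLetterOrDigit)) {
                result.add(segment);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [Hello, world]
        System.out.println(words("Hello, world!", Locale.ENGLISH));
    }
}
```

The plain BreakIterator reports every boundary, including those around spaces and punctuation, which is why the sketch filters segments. A wrapper such as WordIterator can adjust those boundaries further with its own rules.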
Enum Summary
Enum
Description
ITokenizer.StemmingMode
Annotation Types Summary
Annotation Type
Description
Tokenizer
Annotation indicating the languages for which a tokenizer is intended.