Class LuceneSmartChineseTokenizer

    • Constructor Detail

      • LuceneSmartChineseTokenizer

        public LuceneSmartChineseTokenizer()
    • Method Detail

      • tokenizeVerbatim

        public Token[] tokenizeVerbatim(java.lang.String strOrig)
        Description copied from class: BaseTokenizer
        Breaks a string into tokens. Numbers, tags, and other non-word tokens are included in the result. Stemming is NOT used.

        This method is used to mark string differences in the UI and to tune similarity.

        Results are not cached.

        Specified by:
        tokenizeVerbatim in interface ITokenizer
        Overrides:
        tokenizeVerbatim in class BaseTokenizer
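
        To illustrate what "verbatim" means here, the following is a minimal sketch and not the tokenizer's actual implementation. The class VerbatimSketch, its methods, and the regular expression are hypothetical stand-ins; the sketch performs no Chinese word segmentation, which the real class presumably delegates to Lucene. It only shows the documented contract: numbers, tags, and other non-word tokens are kept, and words are returned unstemmed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in; NOT the OmegaT or Lucene implementation.
final class VerbatimSketch {
    // One token is: a tag, a digit run, a letter run, a whitespace run,
    // or any other single character. Every character matches some branch.
    private static final Pattern TOKEN = Pattern.compile(
            "<[^<>]+>|\\d+|\\p{L}+|\\s+|.", Pattern.DOTALL);

    /** Returns {offset, length} pairs in input order, covering the whole input. */
    static List<int[]> splitVerbatim(String text) {
        List<int[]> spans = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            spans.add(new int[] {m.start(), m.end() - m.start()});
        }
        return spans;
    }

    /** Resolves each span against the original string; no stemming is applied. */
    static String[] toStrings(String text) {
        List<int[]> spans = splitVerbatim(text);
        String[] out = new String[spans.size()];
        for (int i = 0; i < out.length; i++) {
            int[] s = spans.get(i);
            out[i] = text.substring(s[0], s[0] + s[1]);
        }
        return out;
    }

    public static void main(String[] args) {
        // prints "Save| |3| |<b>|files|</b>"
        System.out.println(String.join("|", toStrings("Save 3 <b>files</b>")));
    }
}
```

        In this sketch the word "files" is returned as written rather than reduced to a stem, and the number and both tags appear as tokens of their own.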
      • tokenizeVerbatimToStrings

        public java.lang.String[] tokenizeVerbatimToStrings(java.lang.String strOrig)
        Description copied from interface: ITokenizer
        Breaks a string into strings. Numbers, tags, and other non-word tokens are included in the result. Stemming is NOT used.

        This method is used to mark string differences in the UI and for debugging purposes.

        Results are not cached.

        Specified by:
        tokenizeVerbatimToStrings in interface ITokenizer
        Overrides:
        tokenizeVerbatimToStrings in class BaseTokenizer
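
        As a rough illustration of why non-word tokens matter for marking differences, the sketch below compares two arrays of verbatim token strings and brackets the tokens that are new. DiffMarkSketch and markNewTokens are hypothetical names, and the set-based comparison is a deliberate simplification; it is not how the UI computes differences.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper; the input arrays stand in for verbatim token strings.
final class DiffMarkSketch {
    /**
     * Wraps in brackets every token of 'changed' that never occurs in
     * 'original'. Whitespace tokens are expected to be part of both arrays.
     */
    static String markNewTokens(String[] original, String[] changed) {
        Set<String> known = new HashSet<>(Arrays.asList(original));
        StringBuilder out = new StringBuilder();
        for (String token : changed) {
            if (known.contains(token)) {
                out.append(token);
            } else {
                out.append('[').append(token).append(']');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] before = {"Save", " ", "3", " ", "files"};
        String[] after = {"Save", " ", "5", " ", "files"};
        // prints "Save [5] files"
        System.out.println(markNewTokens(before, after));
    }
}
```

        If numbers were dropped during tokenization, the two inputs above would yield identical token lists and the change from 3 to 5 could not be marked.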