Class FindMatches


  • public class FindMatches
    extends java.lang.Object
    Finds matches according to the specified criteria. Because stemmers can be used to prepare tokens, similarity is compared in three passes:
    1. Split the original segment into word-only tokens using the stemmer (with a stop-word list), then compare the tokens.
    2. Split the original segment into word-only tokens without the stemmer, then compare the tokens.
    3. Split the original segment into tokens that also include non-word items (numbers and tags) without the stemmer, then compare the tokens.
    This class is not thread-safe and must be used from a single thread only.
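The three passes can be illustrated with a small self-contained sketch. Everything in it is hypothetical and is not the class's actual code: the `ThreePassSketch` class, the tiny stop-word list, the crude "strip a trailing s" stemmer, and the set-overlap score are stand-ins chosen only to show how the same pair of segments can score differently in each pass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the FindMatches implementation.
final class ThreePassSketch {
    // Assumed stop-word list for the stemmed pass.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("the", "a", "an", "of"));
    // Tags such as <b> or </b>, runs of letters, or runs of digits.
    private static final Pattern TOKEN = Pattern.compile("</?\\w+/?>|\\p{L}+|\\p{N}+");

    /** Pass 3 tokens: words, numbers and tags, no stemming. */
    static List<String> allTokens(String segment) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(segment);
        while (m.find()) {
            out.add(m.group().toLowerCase(Locale.ROOT));
        }
        return out;
    }

    /** Pass 2 tokens: words only, no stemming. */
    static List<String> wordTokens(String segment) {
        List<String> out = new ArrayList<>();
        for (String t : allTokens(segment)) {
            if (Character.isLetter(t.charAt(0))) {
                out.add(t);
            }
        }
        return out;
    }

    /** Pass 1 tokens: words only, stop words dropped, crude stand-in stemmer applied. */
    static List<String> stemmedTokens(String segment) {
        List<String> out = new ArrayList<>();
        for (String t : wordTokens(segment)) {
            if (STOP_WORDS.contains(t)) {
                continue;
            }
            // Stand-in stemmer: strip one trailing "s" from longer words.
            out.add(t.length() > 3 && t.endsWith("s") ? t.substring(0, t.length() - 1) : t);
        }
        return out;
    }

    /** Stand-in score: percentage overlap of the two token sets (0..100). */
    static int similarity(List<String> a, List<String> b) {
        Set<String> sa = new HashSet<>(a);
        Set<String> sb = new HashSet<>(b);
        int total = sa.size() + sb.size();
        if (total == 0) {
            return 0;
        }
        sa.retainAll(sb); // sa now holds the shared tokens
        return 200 * sa.size() / total;
    }

    /** Returns the scores of pass 1, pass 2 and pass 3, in that order. */
    static int[] scores(String source, String candidate) {
        return new int[] {
            similarity(stemmedTokens(source), stemmedTokens(candidate)),
            similarity(wordTokens(source), wordTokens(candidate)),
            similarity(allTokens(source), allTokens(candidate))
        };
    }

    public static void main(String[] args) {
        int[] s = scores("Open the files <b>3</b>", "Open file <b>4</b>");
        System.out.println(s[0] + " / " + s[1] + " / " + s[2]);
    }
}
```

In this sketch the stemmed pass treats "files" and "file" as the same token and ignores "the", so it scores higher than the two unstemmed passes, while the third pass is the only one that notices the differing number.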
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  FindMatches.StoppedException
      Thrown when the search process has been stopped. All callers must catch it and simply skip further processing.
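The catch-and-skip contract might look like the following sketch. The nested `StoppedException` here is a local stand-in (assumed to be unchecked), and `search`, `searchOrSkip` and the `BooleanSupplier` stop flag are hypothetical names used only for illustration; they are not the class's real methods.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical illustration of "catch it and skip"; not the real API.
final class StoppedSketch {
    /** Local stand-in for FindMatches.StoppedException (assumed unchecked). */
    static final class StoppedException extends RuntimeException {
        private static final long serialVersionUID = 1L;
    }

    /** Hypothetical long-running search that checks a stop flag per candidate. */
    static List<String> search(List<String> candidates, BooleanSupplier stopped) {
        List<String> result = new ArrayList<>();
        for (String candidate : candidates) {
            if (stopped.getAsBoolean()) {
                throw new StoppedException();
            }
            result.add(candidate);
        }
        return result;
    }

    /** Caller side: a stopped search is not an error, so the result is just skipped. */
    static List<String> searchOrSkip(List<String> candidates, BooleanSupplier stopped) {
        try {
            return search(candidates, stopped);
        } catch (StoppedException ex) {
            return Collections.emptyList();
        }
    }
}
```

The caller does not log or rethrow in this sketch, because a stop request means the result is no longer wanted.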
    • Constructor Summary

      Constructors 
      Constructor Description
      FindMatches(IProject project, int maxCount, boolean allowSeparateSegmentMatch, boolean searchExactlyTheSame)
    • Constructor Detail

      • FindMatches

        public FindMatches(IProject project,
                           int maxCount,
                           boolean allowSeparateSegmentMatch,
                           boolean searchExactlyTheSame)
        Parameters:
        searchExactlyTheSame - allows matches whose text is identical to the source segment. This mode is used only for separate-sentence matching in paragraph-segmented projects, i.e. where the text being matched is just part of the current source segment.
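One way to read the `searchExactlyTheSame` flag is as a filter on candidates whose text equals the source being searched. The sketch below is an interpretation of the parameter description above, not the actual implementation; `ExactMatchSketch`, `candidates` and the `memory` list are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the searchExactlyTheSame flag; not the real code.
final class ExactMatchSketch {
    /**
     * Returns the memory entries that may be compared against the source.
     * Entries identical to the source are kept only when the flag is set.
     */
    static List<String> candidates(String source, List<String> memory,
                                   boolean searchExactlyTheSame) {
        List<String> out = new ArrayList<>();
        for (String entry : memory) {
            if (!searchExactlyTheSame && entry.equals(source)) {
                continue; // skip comparing the segment with itself
            }
            out.add(entry);
        }
        return out;
    }
}
```

With the flag off, a segment never matches its own text. With the flag on, a sentence taken from a larger paragraph segment can still match an entry with exactly that sentence as its text.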