Package org.omegat.tokenizer
Class WordIterator
java.lang.Object
  java.text.BreakIterator
    org.omegat.tokenizer.WordIterator
All Implemented Interfaces:
java.lang.Cloneable
public class WordIterator extends java.text.BreakIterator
BreakIterator for word breaks with OmegaT heuristics, based on an instance of BreakIterator implementing word breaks.
See Also:
BreakIterator.getWordInstance()
Constructor Summary
Constructor
Description
WordIterator()
Creates a new instance of OmegaT's own word BreakIterator.
Method Summary
Modifier and Type, Method
Description
int current()
Return character index of the text boundary that was most recently returned by next(), previous(), first(), or last().
int first()
Return the first boundary.
int following(int offset)
Not yet implemented! Throws a RuntimeException if you try to call it. Return the first boundary following the specified offset.
java.text.CharacterIterator getText()
Not yet implemented! Throws a RuntimeException if you try to call it. Get the text being scanned.
int last()
Not yet implemented! Throws a RuntimeException if you try to call it. Return the last boundary.
int next()
Return the boundary of the word following the current boundary.
int next(int n)
Not yet implemented! Throws a RuntimeException if you try to call it. Return the nth boundary from the current boundary.
int previous()
Not yet implemented! Throws a RuntimeException if you try to call it. Return the boundary preceding the current boundary.
void setText(java.lang.String newText)
Set a new text string to be scanned.
void setText(java.text.CharacterIterator newText)
Not yet implemented! Throws a RuntimeException if you try to call it. Set a new text for scanning.
Method Detail
-
setText
public void setText(java.lang.String newText)
Set a new text string to be scanned. The current scan position is reset to first().
Overrides:
setText in class java.text.BreakIterator
Parameters:
newText - new text to scan.
-
first
public int first()
Return the first boundary. The iterator's current position is set to the first boundary.
Specified by:
first in class java.text.BreakIterator
Returns:
The character index of the first text boundary.
-
current
public int current()
Return character index of the text boundary that was most recently returned by next(), previous(), first(), or last().
Specified by:
current in class java.text.BreakIterator
Returns:
The boundary most recently returned.
-
next
public int next()
Return the boundary of the word following the current boundary.
Note: This iterator skips OmegaT-specific tags, and groups [text-]mnemonics-text into a single token.
Specified by:
next in class java.text.BreakIterator
Returns:
The character index of the next text boundary or DONE if all boundaries have been returned. The BreakIterator contract describes this as equivalent to next(1), but next(int) is not implemented in this class and throws a RuntimeException.
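The implemented methods (setText(String), first(), next(), current()) are enough for a forward scan over the text. The following is a minimal sketch of that loop, not OmegaT's own code. It uses the JDK's BreakIterator.getWordInstance() as a stand-in so that it runs without OmegaT; the class name WordLoopSketch and the helper tokens() are hypothetical. With OmegaT on the classpath, a WordIterator could presumably be passed in the same way, in which case tags and mnemonics would be grouped as described in the note above.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class WordLoopSketch {

    /** Splits text at the iterator's boundaries using only setText(String), first() and next(). */
    static List<String> tokens(BreakIterator it, String text) {
        List<String> out = new ArrayList<>();
        it.setText(text);
        int start = it.first();
        // next() returns BreakIterator.DONE once every boundary has been returned.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            out.add(text.substring(start, end));
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in iterator (assumption): replace with new WordIterator() when OmegaT is available.
        BreakIterator it = BreakIterator.getWordInstance(Locale.ROOT);
        for (String token : tokens(it, "Hello world")) {
            System.out.println("[" + token + "]");
        }
    }
}
```

Whitespace runs come back as tokens of their own with the stand-in iterator, so callers that only want words would filter the result.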
-
next
public int next(int n)
Not yet implemented! Throws a RuntimeException if you try to call it. Return the nth boundary from the current boundary.
Specified by:
next in class java.text.BreakIterator
Parameters:
n - which boundary to return. A value of 0 does nothing. Negative values move to previous boundaries and positive values move to later boundaries.
Returns:
The index of the nth boundary from the current position.
-
following
public int following(int offset)
Not yet implemented! Throws a RuntimeException if you try to call it. Return the first boundary following the specified offset. The value returned is always greater than the offset or the value BreakIterator.DONE.
Specified by:
following in class java.text.BreakIterator
Parameters:
offset - the offset to begin scanning. Valid values are determined by the CharacterIterator passed to setText(). Invalid values cause an IllegalArgumentException to be thrown.
Returns:
The first boundary after the specified offset.
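Because following(int) throws here, a caller that needs this behavior could emulate it with a forward scan over the implemented methods. The sketch below is one possible workaround under that assumption, not part of OmegaT; FollowingSketch and its following() helper are hypothetical names, and the JDK word iterator stands in for WordIterator so the example runs on its own.

```java
import java.text.BreakIterator;
import java.util.Locale;

final class FollowingSketch {

    /**
     * Returns the first boundary strictly greater than offset, or BreakIterator.DONE.
     * Uses only setText(String), first() and next(); the scan restarts from the beginning.
     */
    static int following(BreakIterator it, String text, int offset) {
        if (offset < 0 || offset > text.length()) {
            throw new IllegalArgumentException("offset out of range: " + offset);
        }
        it.setText(text); // resets the scan position to first()
        for (int b = it.first(); b != BreakIterator.DONE; b = it.next()) {
            if (b > offset) {
                return b;
            }
        }
        return BreakIterator.DONE;
    }

    public static void main(String[] args) {
        // Stand-in iterator (assumption): replace with new WordIterator() when OmegaT is available.
        BreakIterator it = BreakIterator.getWordInstance(Locale.ROOT);
        System.out.println(following(it, "Hello world", 2));
    }
}
```

Each call rescans from the start, so this costs time linear in the text length; that is acceptable for short segments but worth caching for repeated lookups.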
-
setText
public void setText(java.text.CharacterIterator newText)
Not yet implemented! Throws a RuntimeException if you try to call it. Set a new text for scanning. The current scan position is reset to first().
Specified by:
setText in class java.text.BreakIterator
Parameters:
newText - new text to scan.
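Code that only has a CharacterIterator can still use the implemented setText(String) overload by copying the characters into a String first. The following sketch shows one way to do that; CharIterToStringSketch and toText() are hypothetical names, and the JDK word iterator is used as a stand-in for WordIterator.

```java
import java.text.BreakIterator;
import java.text.CharacterIterator;
import java.text.StringCharacterIterator;
import java.util.Locale;

final class CharIterToStringSketch {

    /** Copies the characters in [getBeginIndex(), getEndIndex()) into a String. */
    static String toText(CharacterIterator source) {
        // Work on a clone so the caller's iterator position is left untouched.
        CharacterIterator ci = (CharacterIterator) source.clone();
        StringBuilder sb = new StringBuilder(ci.getEndIndex() - ci.getBeginIndex());
        // Index-based loop: avoids mistaking a literal '\uFFFF' in the text for CharacterIterator.DONE.
        for (int i = ci.getBeginIndex(); i < ci.getEndIndex(); i++) {
            sb.append(ci.setIndex(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        CharacterIterator ci = new StringCharacterIterator("Hello world");
        // Stand-in iterator (assumption): replace with new WordIterator() when OmegaT is available.
        BreakIterator it = BreakIterator.getWordInstance(Locale.ROOT);
        it.setText(toText(ci)); // uses the String overload only
        System.out.println(it.first());
    }
}
```

Boundaries obtained this way are relative to the copied String, so for an iterator that does not begin at index 0 the caller would add getBeginIndex() to map them back.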
-
getText
public java.text.CharacterIterator getText()
Not yet implemented! Throws a RuntimeException if you try to call it. Get the text being scanned.
Specified by:
getText in class java.text.BreakIterator
Returns:
The text being scanned.
-
previous
public int previous()
Not yet implemented! Throws a RuntimeException if you try to call it. Return the boundary preceding the current boundary.
Specified by:
previous in class java.text.BreakIterator
Returns:
The character index of the previous text boundary or DONE if all boundaries have been returned.
-
last
public int last()
Not yet implemented! Throws a RuntimeException if you try to call it. Return the last boundary. The iterator's current position is set to the last boundary.
Specified by:
last in class java.text.BreakIterator
Returns:
The character index of the last text boundary.
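Since last(), previous() and next(int) all throw, one workaround is to collect every boundary once with first() and next() and answer those queries from the list. This is a sketch under that assumption rather than OmegaT code; BoundaryListSketch and its helpers are hypothetical names, and the JDK word iterator stands in for WordIterator.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class BoundaryListSketch {

    /** Collects all boundaries in ascending order using only setText(String), first() and next(). */
    static List<Integer> boundaries(BreakIterator it, String text) {
        List<Integer> out = new ArrayList<>();
        it.setText(text);
        for (int b = it.first(); b != BreakIterator.DONE; b = it.next()) {
            out.add(b);
        }
        return out;
    }

    /** Emulates last(): the final collected boundary. */
    static int last(List<Integer> bs) {
        return bs.get(bs.size() - 1);
    }

    /** Emulates previous(): the boundary before current, or DONE if current is the first or unknown. */
    static int previous(List<Integer> bs, int current) {
        int i = bs.indexOf(current);
        return i > 0 ? bs.get(i - 1) : BreakIterator.DONE;
    }

    public static void main(String[] args) {
        // Stand-in iterator (assumption): replace with new WordIterator() when OmegaT is available.
        BreakIterator it = BreakIterator.getWordInstance(Locale.ROOT);
        List<Integer> bs = boundaries(it, "Hello world");
        System.out.println(bs + " last=" + last(bs) + " previous(6)=" + previous(bs, 6));
    }
}
```

The list is a snapshot: it must be rebuilt after every setText call, and unlike the real methods these helpers do not move the iterator's current position.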