Package org.omegat.util
Class StringUtil
- java.lang.Object
-
- org.omegat.util.StringUtil
-
public final class StringUtil extends java.lang.Object
Utilities for string processing.
-
-
Field Summary
Modifier and Type, Field
- static char TRUNCATE_CHAR
-
Method Summary
Modifier and Type, Method: Description
- static java.lang.String capitalizeFirst(java.lang.String text, java.util.Locale locale)
- static <T extends java.lang.Comparable<T>> int compareToWithNulls(T v1, T v2): Compare two values, which could be null.
- static java.lang.String compressSpaces(java.lang.String str): Compresses spaces in case of non-preformatting paragraph.
- static java.util.List<java.lang.String> convertToList(java.lang.String str): For a string containing a space-separated list of items, convert that string into an ArrayList.
- static java.lang.String decodeBase64(java.lang.String b64data, java.nio.charset.Charset charset): Decode the Base64-encoded charset bytes back to a String.
- static java.lang.String encodeBase64(char[] chars, java.nio.charset.Charset charset): Convert a char array's charset bytes into a Base64-encoded String.
- static java.lang.String encodeBase64(java.lang.String string, java.nio.charset.Charset charset): Convert a string's charset bytes into a Base64-encoded String.
- static boolean equal(java.lang.String one, java.lang.String two): Compares two strings for equality.
- static java.lang.String escapeXMLChars(int cp): Converts a single code point into valid XML.
- static java.lang.String firstN(java.lang.String str, int len): Extracts first N codepoints from string.
- static java.lang.String format(java.lang.String str, java.lang.Object... arguments): Formats UI strings.
- static int getFirstLetterLowercase(java.lang.String s): Returns first letter in lowercase.
- static java.lang.String getTailSegments(java.lang.String str, int separator, int segments): For a string delimited by some separator, retrieve the last N segments, where N is given by the segments parameter.
- static boolean isCJK(java.lang.String input)
- static boolean isEmpty(java.lang.String str): Check if string is empty, i.e. null or length==0.
- static boolean isLowerCase(java.lang.String input): Returns true if the input has at least one letter and all letters are lower case.
- static boolean isMixedCase(java.lang.String input): Returns true if the input has both upper case and lower case letters, but is not title case.
- static boolean isSubstringAfter(java.lang.String text, int pos, java.lang.String substring): Checks if text contains substring after specified position.
- static boolean isSubstringBefore(java.lang.String text, int pos, java.lang.String substring): Checks if text contains substring before specified position.
- static boolean isTitleCase(int codePoint)
- static boolean isTitleCase(java.lang.String input): Returns true if the input is title case, meaning the first character is UpperCase or TitleCase* and the rest of the string (if present) is LowerCase.
- static boolean isUpperCase(java.lang.String input): Returns true if the input is upper case.
- static boolean isValidXMLChar(int codePoint)
- static boolean isWhiteSpace(int codePoint): Returns true if the input is a whitespace character (including non-breaking characters that are false according to Character.isWhitespace(int)).
- static boolean isWhiteSpace(java.lang.String input): Returns true if the input consists only of whitespace characters (including non-breaking characters that are false according to Character.isWhitespace(int)).
- static java.lang.String makeValidXML(java.lang.String plaintext): Converts a stream of plaintext into valid XML.
- static java.lang.String matchCapitalization(java.lang.String text, java.lang.String matchTo, java.util.Locale locale)
- static java.lang.String normalizeUnicode(java.lang.CharSequence text): Apply Unicode NFC normalization to a string.
- static java.lang.String normalizeWidth(java.lang.String text): Normalize the width of characters in the supplied text.
- static <T> T nvl(T... values): Returns the first non-null object from the list, or null if all values are null.
- static long nvlLong(long... values): Returns the first non-zero value from the list, or zero if all values are zero.
- static java.lang.String removeXMLInvalidChars(java.lang.String str): Replace invalid XML chars by spaces.
- static java.lang.String replaceCase(java.lang.String txt, java.util.Locale lang): Interpret the case replacement language used in regular expressions: backslash u = uppercase next letter; backslash l = lowercase next letter; backslash U = uppercase next letters until backslash E; backslash L = lowercase next letters until backslash E; backslash u + backslash L = uppercase next letter then lowercase all until backslash E; backslash l + backslash U = lowercase next letter then uppercase all until backslash E. Warning: this method works with the string you give to it; if you want to do other substitutions, such as variable conversions, they must be done before the call to replaceCase, else this method will not apply to the not-yet-converted parts!
- static java.lang.String rstrip(java.lang.String text): Strip whitespace from the end of a string.
- static java.lang.String stripFromEnd(java.lang.String string, java.lang.String... toStrip)
- static java.lang.String toTitleCase(java.lang.String text, java.util.Locale locale): Convert text to title case according to the supplied locale.
- static java.lang.String truncate(java.lang.String text, int len): Truncate the supplied text to a maximum of len codepoints.
- static java.lang.String unescapeXMLEntities(java.lang.String text): Converts XML entities to characters.
- static java.lang.String wrap(java.lang.String text, int length): Wrap line by length.
-
-
-
Field Detail
-
TRUNCATE_CHAR
public static final char TRUNCATE_CHAR
- See Also:
- Constant Field Values
-
-
Method Detail
-
isEmpty
public static boolean isEmpty(java.lang.String str)
Check if string is empty, i.e. null or length==0
-
isLowerCase
public static boolean isLowerCase(java.lang.String input)
Returns true if the input has at least one letter and all letters are lower case.
-
isUpperCase
public static boolean isUpperCase(java.lang.String input)
Returns true if the input is upper case.
-
isMixedCase
public static boolean isMixedCase(java.lang.String input)
Returns true if the input has both upper case and lower case letters, but is not title case.
-
isTitleCase
public static boolean isTitleCase(java.lang.String input)
Returns true if the input is title case, meaning the first character is UpperCase or TitleCase* and the rest of the string (if present) is LowerCase.
*There are exotic characters that are neither UpperCase nor LowerCase, but are TitleCase: e.g. LATIN CAPITAL LETTER L WITH SMALL LETTER J (U+01C8). These are handled correctly.
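The title-case rule described above can be sketched with the JDK's code point classification methods. This is a minimal illustration, not the OmegaT implementation: the class name TitleCaseSketch is hypothetical, and the treatment of empty input (false) and of non-letter characters in the remainder (ignored) are assumptions.

```java
final class TitleCaseSketch {
    private TitleCaseSketch() {}

    /**
     * True if the first code point is upper case or title case and every
     * letter after it is lower case. Non-letters after the first code point
     * are ignored (an assumption of this sketch).
     */
    static boolean isTitleCase(String input) {
        if (input == null || input.isEmpty()) {
            return false; // assumption: empty input is not title case
        }
        int first = input.codePointAt(0);
        if (!Character.isUpperCase(first) && !Character.isTitleCase(first)) {
            return false;
        }
        // Walk the remainder by code point so surrogate pairs are not split.
        for (int i = Character.charCount(first); i < input.length();) {
            int cp = input.codePointAt(i);
            if (Character.isLetter(cp) && !Character.isLowerCase(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }
}
```

U+01C8 is reported as title case, but not upper case, by java.lang.Character, which is why the first code point is tested against both predicates.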
-
isTitleCase
public static boolean isTitleCase(int codePoint)
-
isWhiteSpace
public static boolean isWhiteSpace(java.lang.String input)
Returns true if the input consists only of whitespace characters (including non-breaking characters that are false according to Character.isWhitespace(int)).
-
isWhiteSpace
public static boolean isWhiteSpace(int codePoint)
Returns true if the input is a whitespace character (including non-breaking characters that are false according to Character.isWhitespace(int)).
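Character.isWhitespace(int) excludes the three non-breaking spaces U+00A0, U+2007 and U+202F. The sketch below shows one way to add them back; it assumes those three are exactly the extra characters meant here, and that an empty string does not count as whitespace. The class name WhiteSpaceSketch is hypothetical.

```java
final class WhiteSpaceSketch {
    private WhiteSpaceSketch() {}

    /** Whitespace per Character.isWhitespace, plus the non-breaking spaces it rejects. */
    static boolean isWhiteSpace(int codePoint) {
        return Character.isWhitespace(codePoint)
                || codePoint == 0x00A0   // NO-BREAK SPACE
                || codePoint == 0x2007   // FIGURE SPACE
                || codePoint == 0x202F;  // NARROW NO-BREAK SPACE
    }

    /** True only if the string is non-empty and every code point is whitespace. */
    static boolean isWhiteSpace(String input) {
        if (input == null || input.isEmpty()) {
            return false; // assumption: empty input is not whitespace
        }
        return input.codePoints().allMatch(WhiteSpaceSketch::isWhiteSpace);
    }
}
```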
-
isCJK
public static boolean isCJK(java.lang.String input)
-
capitalizeFirst
public static java.lang.String capitalizeFirst(java.lang.String text, java.util.Locale locale)
-
replaceCase
public static java.lang.String replaceCase(java.lang.String txt, java.util.Locale lang)
Interpret the case replacement language used in regular expressions:
- backslash u = uppercase next letter
- backslash l = lowercase next letter
- backslash U = uppercase next letters until backslash E
- backslash L = lowercase next letters until backslash E
- backslash u + backslash L = uppercase next letter then lowercase all until backslash E
- backslash l + backslash U = lowercase next letter then uppercase all until backslash E
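The rules listed above can be modelled as a small state machine with a one-shot flag (set by backslash u or backslash l) and a span mode (set by backslash U or backslash L and cleared by backslash E). The following is only a sketch under those assumptions, not the OmegaT code. The class name ReplaceCaseSketch is hypothetical, and the one-shot flag here is consumed by the next code point even when it is not a letter.

```java
import java.util.Locale;

final class ReplaceCaseSketch {
    private ReplaceCaseSketch() {}

    private enum Mode { NONE, UPPER, LOWER }

    static String replaceCase(String txt, Locale locale) {
        StringBuilder out = new StringBuilder();
        Mode span = Mode.NONE; // active until the end marker E
        Mode next = Mode.NONE; // applies to the next code point only
        for (int i = 0; i < txt.length();) {
            int cp = txt.codePointAt(i);
            i += Character.charCount(cp);
            if (cp == '\\' && i < txt.length()) {
                char c = txt.charAt(i);
                if ("ulULE".indexOf(c) >= 0) {
                    i++; // consume the control letter
                    switch (c) {
                    case 'u': next = Mode.UPPER; break;
                    case 'l': next = Mode.LOWER; break;
                    case 'U': span = Mode.UPPER; break;
                    case 'L': span = Mode.LOWER; break;
                    default: span = Mode.NONE; next = Mode.NONE; break; // end marker E
                    }
                    continue;
                }
            }
            String s = new String(Character.toChars(cp));
            // The one-shot flag takes precedence over the span mode.
            Mode mode = next != Mode.NONE ? next : span;
            next = Mode.NONE;
            if (mode == Mode.UPPER) {
                s = s.toUpperCase(locale);
            } else if (mode == Mode.LOWER) {
                s = s.toLowerCase(locale);
            }
            out.append(s);
        }
        return out.toString();
    }
}
```

With this sketch, the input consisting of backslash u, backslash L, "hELLO", backslash E and " There" yields "Hello There".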
-
matchCapitalization
public static java.lang.String matchCapitalization(java.lang.String text, java.lang.String matchTo, java.util.Locale locale)
-
toTitleCase
public static java.lang.String toTitleCase(java.lang.String text, java.util.Locale locale)
Convert text to title case according to the supplied locale.
-
nvl
@SafeVarargs public static <T> T nvl(T... values)
Returns the first non-null object from the list, or null if all values are null.
-
nvlLong
public static long nvlLong(long... values)
Returns the first non-zero value from the list, or zero if all values are zero.
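Both nvl and nvlLong amount to "first usable value wins". A minimal sketch of that behaviour follows; the class name NvlSketch is hypothetical, and the handling of an empty argument list (null or zero respectively) is an assumption.

```java
final class NvlSketch {
    private NvlSketch() {}

    /** First non-null argument, or null if there is none. */
    @SafeVarargs
    static <T> T nvl(T... values) {
        for (T value : values) {
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    /** First non-zero argument, or zero if there is none. */
    static long nvlLong(long... values) {
        for (long value : values) {
            if (value != 0) {
                return value;
            }
        }
        return 0;
    }
}
```

A typical use is choosing a fallback, for example nvl(userSetting, projectSetting, "default"), where the variable names are illustrative only.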
-
compareToWithNulls
public static <T extends java.lang.Comparable<T>> int compareToWithNulls(T v1, T v2)
Compare two values, which could be null.
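The description does not say where null sorts relative to non-null values. The sketch below assumes nulls sort first and two nulls compare as equal; that ordering and the class name NullCompareSketch are assumptions made for illustration.

```java
final class NullCompareSketch {
    private NullCompareSketch() {}

    /** Null-tolerant comparison; null is treated as smaller than any value (assumption). */
    static <T extends Comparable<T>> int compareToWithNulls(T v1, T v2) {
        if (v1 == null) {
            return v2 == null ? 0 : -1;
        }
        if (v2 == null) {
            return 1;
        }
        return v1.compareTo(v2);
    }
}
```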
-
firstN
public static java.lang.String firstN(java.lang.String str, int len)
Extracts first N codepoints from string.
-
truncate
public static java.lang.String truncate(java.lang.String text, int len)
Truncate the supplied text to a maximum of len codepoints. If truncated, the result will be the first (len - 1) codepoints plus a trailing ellipsis.
- Parameters:
- text - The text to truncate
- len - The desired length (in codepoints) of the result
- Returns:
- The truncated string
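Counting in codepoints rather than chars matters for text containing surrogate pairs. The sketch below shows how firstN and truncate could be written with String.codePointCount and String.offsetByCodePoints. It assumes that TRUNCATE_CHAR is the horizontal ellipsis U+2026 and that len is at least 1; the class name TruncateSketch is hypothetical.

```java
final class TruncateSketch {
    private TruncateSketch() {}

    /** Assumed value of the ellipsis constant: HORIZONTAL ELLIPSIS. */
    static final char TRUNCATE_CHAR = '\u2026';

    /** First len code points of str, or str itself if it is short enough. */
    static String firstN(String str, int len) {
        if (str.codePointCount(0, str.length()) <= len) {
            return str;
        }
        // offsetByCodePoints never lands in the middle of a surrogate pair.
        return str.substring(0, str.offsetByCodePoints(0, len));
    }

    /** At most len code points; if cut, the last one is the ellipsis. Assumes len >= 1. */
    static String truncate(String text, int len) {
        if (text.codePointCount(0, text.length()) <= len) {
            return text;
        }
        return firstN(text, len - 1) + TRUNCATE_CHAR;
    }
}
```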
-
getFirstLetterLowercase
public static int getFirstLetterLowercase(java.lang.String s)
Returns first letter in lowercase. Usually used to create tag shortcuts.
-
isSubstringAfter
public static boolean isSubstringAfter(java.lang.String text, int pos, java.lang.String substring)
Checks if text contains substring after specified position.
-
isSubstringBefore
public static boolean isSubstringBefore(java.lang.String text, int pos, java.lang.String substring)
Checks if text contains substring before specified position.
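The wording "after" and "before" is open to interpretation. The sketch below reads it as "substring starts exactly at pos" and "substring ends exactly at pos" respectively; that reading and the class name SubstringAtSketch are assumptions, and the actual methods may behave differently. String.regionMatches returns false for out-of-range offsets, so no explicit bounds checks are needed.

```java
final class SubstringAtSketch {
    private SubstringAtSketch() {}

    /** True if substring occupies text[pos, pos + substring.length()). */
    static boolean isSubstringAfter(String text, int pos, String substring) {
        return text.regionMatches(pos, substring, 0, substring.length());
    }

    /** True if substring occupies text[pos - substring.length(), pos). */
    static boolean isSubstringBefore(String text, int pos, String substring) {
        return text.regionMatches(pos - substring.length(), substring, 0, substring.length());
    }
}
```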
-
stripFromEnd
public static java.lang.String stripFromEnd(java.lang.String string, java.lang.String... toStrip)
-
normalizeUnicode
public static java.lang.String normalizeUnicode(java.lang.CharSequence text)
Apply Unicode NFC normalization to a string.
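NFC normalization is available in the JDK through java.text.Normalizer. The following sketch shows the effect on a decomposed sequence; the class name NfcSketch is hypothetical, and whether the real method uses this JDK class is not stated here.

```java
import java.text.Normalizer;

final class NfcSketch {
    private NfcSketch() {}

    /** Returns the NFC form of text, skipping the work when it is already normalized. */
    static String normalizeUnicode(CharSequence text) {
        return Normalizer.isNormalized(text, Normalizer.Form.NFC)
                ? text.toString()
                : Normalizer.normalize(text, Normalizer.Form.NFC);
    }
}
```

For example, "e" followed by COMBINING ACUTE ACCENT (U+0301) becomes the single code point U+00E9.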
-
removeXMLInvalidChars
public static java.lang.String removeXMLInvalidChars(java.lang.String str)
Replace invalid XML chars by spaces.
- Parameters:
- str - input stream
- Returns:
- result stream
- See Also:
- Supported chars
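As an illustration, the sketch below replaces every code point outside the XML 1.0 Char production with a space. That the "Supported chars" reference means the XML 1.0 production is an assumption, as is the class name XmlCharSketch.

```java
final class XmlCharSketch {
    private XmlCharSketch() {}

    /** XML 1.0 Char: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]. */
    static boolean isValidXMLChar(int codePoint) {
        return codePoint == 0x9 || codePoint == 0xA || codePoint == 0xD
                || (codePoint >= 0x20 && codePoint <= 0xD7FF)
                || (codePoint >= 0xE000 && codePoint <= 0xFFFD)
                || (codePoint >= 0x10000 && codePoint <= 0x10FFFF);
    }

    /** Copies str, substituting a space for each code point that XML 1.0 forbids. */
    static String removeXMLInvalidChars(String str) {
        StringBuilder sb = new StringBuilder(str.length());
        str.codePoints().forEach(cp -> sb.appendCodePoint(isValidXMLChar(cp) ? cp : ' '));
        return sb.toString();
    }
}
```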
-
isValidXMLChar
public static boolean isValidXMLChar(int codePoint)
-
makeValidXML
public static java.lang.String makeValidXML(java.lang.String plaintext)
Converts a stream of plaintext into valid XML. Output stream must convert stream to UTF-8 when saving to disk.
-
compressSpaces
public static java.lang.String compressSpaces(java.lang.String str)
Compresses spaces in case of non-preformatting paragraph.
-
escapeXMLChars
public static java.lang.String escapeXMLChars(int cp)
Converts a single code point into valid XML. Output stream must convert stream to UTF-8 when saving to disk.
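A per-code-point escaper composes naturally into a whole-string one, which is presumably how escapeXMLChars and makeValidXML relate. The sketch below assumes that the characters escaped are the ampersand, the angle brackets and the double quote; the exact set used by the real methods is not documented here, and the class name XmlEscapeSketch is hypothetical.

```java
final class XmlEscapeSketch {
    private XmlEscapeSketch() {}

    /** Escapes one code point; anything not special is returned unchanged. */
    static String escapeXMLChars(int cp) {
        switch (cp) {
        case '&': return "&amp;";
        case '<': return "&lt;";
        case '>': return "&gt;";
        case '"': return "&quot;";
        default: return new String(Character.toChars(cp));
        }
    }

    /** Escapes a whole string by applying escapeXMLChars to each code point. */
    static String makeValidXML(String plaintext) {
        StringBuilder sb = new StringBuilder(plaintext.length());
        plaintext.codePoints().forEach(cp -> sb.append(escapeXMLChars(cp)));
        return sb.toString();
    }
}
```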
-
unescapeXMLEntities
public static java.lang.String unescapeXMLEntities(java.lang.String text)
Converts XML entities to characters.
-
equal
public static boolean equal(java.lang.String one, java.lang.String two)
Compares two strings for equality. Handles nulls: if both strings are nulls they are considered equal.
-
format
public static java.lang.String format(java.lang.String str, java.lang.Object... arguments)
Formats UI strings. Note: This is only a first attempt at putting right what goes wrong in MessageFormat. Currently it only duplicates single quotes, but it doesn't even test if the string contains parameters (numbers in curly braces), and it doesn't allow for strings containing already escaped quotes.
- Parameters:
- str - The string to format
- arguments - Arguments to use in formatting the string
- Returns:
- The formatted string
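The problem being worked around is that java.text.MessageFormat treats a single quote as a quoting character, so an apostrophe in a UI string swallows the parameters that follow it. Doubling every single quote before formatting, as the note describes, can be sketched as below. The class name UiFormatSketch is hypothetical and the body is an illustration of the described approach, not the actual source.

```java
import java.text.MessageFormat;

final class UiFormatSketch {
    private UiFormatSketch() {}

    /** Doubles every single quote so MessageFormat prints it literally, then formats. */
    static String format(String str, Object... arguments) {
        return MessageFormat.format(str.replace("'", "''"), arguments);
    }
}
```

With this sketch, "Can't open {0}" formats as expected, whereas a pattern that already contains a doubled quote ends up with two literal quotes in the output, which is the limitation the note mentions.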
-
normalizeWidth
public static java.lang.String normalizeWidth(java.lang.String text)
Normalize the width of characters in the supplied text. Specifically:
- ASCII characters will become halfwidth
- Katakana characters will become fullwidth
- Hangul will become fullwidth
- Letter-like symbols and squared Latin abbreviations will be decomposed to ASCII
- Parameters:
- text -
- Returns:
- Normalized-width text
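For a rough feel of these transformations, Unicode NFKC normalization from the JDK produces similar results for the cases listed. This is a swapped-in approximation and not the technique the method necessarily uses: NFKC also rewrites many other compatibility characters (ligatures, superscripts and so on) that the list above does not mention. The class name WidthSketch is hypothetical.

```java
import java.text.Normalizer;

final class WidthSketch {
    private WidthSketch() {}

    /** Approximates width normalization with NFKC; broader than the documented rules. */
    static String normalizeWidth(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFKC);
    }
}
```

Under this approximation, fullwidth Latin letters map to ASCII, halfwidth Katakana maps to fullwidth Katakana, and SQUARE KG (U+338F) becomes "kg".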
-
rstrip
public static java.lang.String rstrip(java.lang.String text)
Strip whitespace from the end of a string. Uses Character.isWhitespace(int), so it does not strip the extra non-breaking whitespace included in isWhiteSpace(int).
- Parameters:
- text -
- Returns:
- text with trailing whitespace removed
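The distinction between the two whitespace tests is easiest to see on a string that ends in a no-break space. The sketch below strips from the end using only Character.isWhitespace(int), as described; the class name RstripSketch is hypothetical.

```java
final class RstripSketch {
    private RstripSketch() {}

    /** Removes trailing code points for which Character.isWhitespace is true. */
    static String rstrip(String text) {
        int end = text.length();
        while (end > 0) {
            int cp = text.codePointBefore(end);
            if (!Character.isWhitespace(cp)) {
                break;
            }
            end -= Character.charCount(cp);
        }
        return text.substring(0, end);
    }
}
```

A trailing U+00A0 survives, because Character.isWhitespace(int) returns false for it.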
-
encodeBase64
public static java.lang.String encodeBase64(java.lang.String string, java.nio.charset.Charset charset)
Convert a string's charset bytes into a Base64-encoded String.
- Parameters:
- string - a string
- charset - the charset with which to obtain the bytes
- Returns:
- Base64-encoded String
-
encodeBase64
public static java.lang.String encodeBase64(char[] chars, java.nio.charset.Charset charset)
Convert a char array's charset bytes into a Base64-encoded String. Useful for handling passwords. Intermediate buffers are cleared after use.
- Parameters:
- chars - a char array
- charset - the charset with which to obtain the bytes
- Returns:
- Base64-encoded String
-
decodeBase64
public static java.lang.String decodeBase64(java.lang.String b64data, java.nio.charset.Charset charset)
Decode the Base64-encoded charset bytes back to a String.
- Parameters:
- b64data - Base64-encoded String
- charset - charset of decoded bytes
- Returns:
- String
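The three Base64 methods form a round trip, and the char[] variant additionally wipes its temporary byte buffers. The sketch below shows one way to get that behaviour with java.util.Base64 and java.nio; it is an assumption about the approach, not the actual source, and the class name Base64Sketch is hypothetical. The caller's char array is left untouched here.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.util.Arrays;
import java.util.Base64;

final class Base64Sketch {
    private Base64Sketch() {}

    static String encodeBase64(String string, Charset charset) {
        return Base64.getEncoder().encodeToString(string.getBytes(charset));
    }

    /** Encodes without creating an intermediate String; zeroes temporary buffers afterwards. */
    static String encodeBase64(char[] chars, Charset charset) {
        ByteBuffer buffer = charset.encode(CharBuffer.wrap(chars));
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        try {
            return Base64.getEncoder().encodeToString(bytes);
        } finally {
            Arrays.fill(bytes, (byte) 0);
            if (buffer.hasArray()) {
                Arrays.fill(buffer.array(), (byte) 0);
            }
        }
    }

    static String decodeBase64(String b64data, Charset charset) {
        return new String(Base64.getDecoder().decode(b64data), charset);
    }
}
```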
-
getTailSegments
public static java.lang.String getTailSegments(java.lang.String str, int separator, int segments)
For a string delimited by some separator, retrieve the last N segments, where N is given by the segments parameter.
- Parameters:
- str - The string
- separator - The separator delimiting the string's segments
- segments - The number of segments to return, starting at the end
- Returns:
- The trailing segments, or, if segments is greater than the number of segments contained in str, then str itself.
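One way to obtain this behaviour is to walk backwards over the separators with String.lastIndexOf. The sketch below does that; returning an empty string when segments is less than 1 is an assumption, as is the class name TailSegmentsSketch.

```java
final class TailSegmentsSketch {
    private TailSegmentsSketch() {}

    /** Last `segments` separator-delimited pieces of str, or str itself if it has fewer. */
    static String getTailSegments(String str, int separator, int segments) {
        if (segments < 1) {
            return ""; // assumption: nothing requested, nothing returned
        }
        int cut = str.length();
        for (int i = 0; i < segments; i++) {
            // Search strictly to the left of the previously found separator.
            cut = str.lastIndexOf(separator, cut - 1);
            if (cut == -1) {
                return str; // fewer separators than requested segments
            }
        }
        return str.substring(cut + Character.charCount(separator));
    }
}
```

For example, with the separator '/' and segments set to 2, "a/b/c/d" yields "c/d".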
-
convertToList
public static java.util.List<java.lang.String> convertToList(java.lang.String str)
For a string containing a space-separated list of items, convert that string into an ArrayList.
- Parameters:
- str - The string, with items separated by whitespace
- Returns:
- An ArrayList of the items in the original space-separated list
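Splitting on runs of whitespace is a common way to get such a list. The sketch below does so with String.split; returning an empty list for blank input is an assumption, and the class name ListSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ListSketch {
    private ListSketch() {}

    /** Splits on one or more whitespace characters; blank input gives an empty list. */
    static List<String> convertToList(String str) {
        String trimmed = str.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }
}
```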
-
wrap
public static java.lang.String wrap(java.lang.String text, int length)
Wrap line by length.
- Parameters:
- text - string to process.
- length - wrap length.
- Returns:
- string wrapped.
-
-