Class StringUtil


  • public final class StringUtil
    extends java.lang.Object
    Utilities for string processing.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static char TRUNCATE_CHAR  
    • Method Summary

      All Methods Static Methods Concrete Methods 
      Modifier and Type Method Description
      static java.lang.String capitalizeFirst​(java.lang.String text, java.util.Locale locale)  
      static <T extends java.lang.Comparable<T>>
      int
      compareToWithNulls​(T v1, T v2)
      Compare two values, which could be null.
      static java.lang.String compressSpaces​(java.lang.String str)
      Compresses spaces in case of non-preformatting paragraph.
      static java.util.List<java.lang.String> convertToList​(java.lang.String str)
      For a string containing a space-separated list of items, convert that string into an ArrayList.
      static java.lang.String decodeBase64​(java.lang.String b64data, java.nio.charset.Charset charset)
      Decode the Base64-encoded charset bytes back to a String.
      static java.lang.String encodeBase64​(char[] chars, java.nio.charset.Charset charset)
      Convert a char array's charset bytes into a Base64-encoded String.
      static java.lang.String encodeBase64​(java.lang.String string, java.nio.charset.Charset charset)
      Convert a string's charset bytes into a Base64-encoded String.
      static boolean equal​(java.lang.String one, java.lang.String two)
      Compares two strings for equality.
      static java.lang.String escapeXMLChars​(int cp)
      Converts a single code point into valid XML.
      static java.lang.String firstN​(java.lang.String str, int len)
      Extracts first N codepoints from string.
      static java.lang.String format​(java.lang.String str, java.lang.Object... arguments)
      Formats UI strings.
      static int getFirstLetterLowercase​(java.lang.String s)
      Returns first letter in lowercase.
      static java.lang.String getTailSegments​(java.lang.String str, int separator, int segments)
      For a string delimited by some separator, retrieve the last N segments, where N is the value of segments.
      static boolean isCJK​(java.lang.String input)  
      static boolean isEmpty​(java.lang.String str)
      Check if string is empty, i.e. null or length==0.
      static boolean isLowerCase​(java.lang.String input)
      Returns true if the input has at least one letter and all letters are lower case.
      static boolean isMixedCase​(java.lang.String input)
      Returns true if the input has both upper case and lower case letters, but is not title case.
      static boolean isSubstringAfter​(java.lang.String text, int pos, java.lang.String substring)
      Checks if text contains substring after specified position.
      static boolean isSubstringBefore​(java.lang.String text, int pos, java.lang.String substring)
      Checks if text contains substring before specified position.
      static boolean isTitleCase​(int codePoint)  
      static boolean isTitleCase​(java.lang.String input)
      Returns true if the input is title case, meaning the first character is UpperCase or TitleCase* and the rest of the string (if present) is LowerCase.
      static boolean isUpperCase​(java.lang.String input)
      Returns true if the input is upper case.
      static boolean isValidXMLChar​(int codePoint)  
      static boolean isWhiteSpace​(int codePoint)
      Returns true if the input is a whitespace character (including non-breaking characters that are false according to Character.isWhitespace(int)).
      static boolean isWhiteSpace​(java.lang.String input)
      Returns true if the input consists only of whitespace characters (including non-breaking characters that are false according to Character.isWhitespace(int)).
      static java.lang.String makeValidXML​(java.lang.String plaintext)
      Converts a stream of plaintext into valid XML.
      static java.lang.String matchCapitalization​(java.lang.String text, java.lang.String matchTo, java.util.Locale locale)  
      static java.lang.String normalizeUnicode​(java.lang.CharSequence text)
      Apply Unicode NFC normalization to a string.
      static java.lang.String normalizeWidth​(java.lang.String text)
      Normalize the width of characters in the supplied text.
      static <T> T nvl​(T... values)
      Returns the first non-null object from the list, or null if all values are null.
      static long nvlLong​(long... values)
      Returns the first non-zero value from the list, or zero if all values are zero.
      static java.lang.String removeXMLInvalidChars​(java.lang.String str)
      Replace invalid XML chars by spaces.
      static java.lang.String replaceCase​(java.lang.String txt, java.util.Locale lang)
      Interpret the case replacement language used in regular expressions: backslash u = uppercase next letter; backslash l = lowercase next letter; backslash U = uppercase next letters until backslash E; backslash L = lowercase next letters until backslash E; backslash u + backslash L = uppercase next letter then lowercase all until backslash E; backslash l + backslash U = lowercase next letter then uppercase all until backslash E. Warning: this method works on the string you give it; any other substitutions, such as variable conversions, must be done before calling replaceCase, otherwise the case directives will not apply to the parts that have not yet been converted.
      static java.lang.String rstrip​(java.lang.String text)
      Strip whitespace from the end of a string.
      static java.lang.String stripFromEnd​(java.lang.String string, java.lang.String... toStrip)  
      static java.lang.String toTitleCase​(java.lang.String text, java.util.Locale locale)
      Convert text to title case according to the supplied locale.
      static java.lang.String truncate​(java.lang.String text, int len)
      Truncate the supplied text to a maximum of len codepoints.
      static java.lang.String unescapeXMLEntities​(java.lang.String text)
      Converts XML entities to characters.
      static java.lang.String wrap​(java.lang.String text, int length)
      Wrap line by length.
      • Methods inherited from class java.lang.Object

        equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Method Detail

      • isEmpty

        public static boolean isEmpty​(java.lang.String str)
        Check if string is empty, i.e. null or length==0
      • isLowerCase

        public static boolean isLowerCase​(java.lang.String input)
        Returns true if the input has at least one letter and all letters are lower case.
      • isUpperCase

        public static boolean isUpperCase​(java.lang.String input)
        Returns true if the input is upper case.
      • isMixedCase

        public static boolean isMixedCase​(java.lang.String input)
        Returns true if the input has both upper case and lower case letters, but is not title case.
      • isTitleCase

        public static boolean isTitleCase​(java.lang.String input)
        Returns true if the input is title case, meaning the first character is UpperCase or TitleCase* and the rest of the string (if present) is LowerCase.

        *There are exotic characters that are neither UpperCase nor LowerCase, but are TitleCase: e.g. LATIN CAPITAL LETTER L WITH SMALL LETTER J (U+01C8)
        These are handled correctly.
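
        The rule above can be approximated with the JDK's per-code-point classification. This is a minimal sketch, not the library's implementation: the class and method names are hypothetical, and it assumes that non-letter characters after the first code point are ignored.

```java
final class TitleCaseSketch {
    // True if the code point can start a title-case word: genuine title case
    // (such as U+01C8) or plain upper case.
    static boolean isTitleCaseStart(int codePoint) {
        return Character.isTitleCase(codePoint) || Character.isUpperCase(codePoint);
    }

    static boolean looksTitleCase(String input) {
        if (input == null || input.isEmpty()) {
            return false;
        }
        int first = input.codePointAt(0);
        if (!isTitleCaseStart(first)) {
            return false;
        }
        // Walk the remaining code points; assumption: non-letters are ignored.
        for (int i = Character.charCount(first); i < input.length();) {
            int cp = input.codePointAt(i);
            if (Character.isLetter(cp) && !Character.isLowerCase(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }
}
```

        Iterating by code point rather than by char keeps the check correct for characters outside the Basic Multilingual Plane.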

      • isTitleCase

        public static boolean isTitleCase​(int codePoint)
      • isWhiteSpace

        public static boolean isWhiteSpace​(java.lang.String input)
        Returns true if the input consists only of whitespace characters (including non-breaking characters that are false according to Character.isWhitespace(int)).
      • isWhiteSpace

        public static boolean isWhiteSpace​(int codePoint)
        Returns true if the input is a whitespace character (including non-breaking characters that are false according to Character.isWhitespace(int)).
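
        To illustrate the difference from Character.isWhitespace(int), here is a hedged sketch. The names are hypothetical, the three non-breaking code points listed are the ones the JDK documents as excluded, and returning false for an empty string is an assumption.

```java
final class WhiteSpaceSketch {
    // Character.isWhitespace plus the non-breaking spaces it deliberately excludes.
    static boolean isWhiteSpace(int codePoint) {
        return Character.isWhitespace(codePoint)
                || codePoint == 0x00A0   // NO-BREAK SPACE
                || codePoint == 0x2007   // FIGURE SPACE
                || codePoint == 0x202F;  // NARROW NO-BREAK SPACE
    }

    static boolean isWhiteSpace(String input) {
        // Assumption: null and empty strings do not count as whitespace.
        if (input == null || input.isEmpty()) {
            return false;
        }
        return input.codePoints().allMatch(cp -> isWhiteSpace(cp));
    }
}
```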
      • isCJK

        public static boolean isCJK​(java.lang.String input)
      • capitalizeFirst

        public static java.lang.String capitalizeFirst​(java.lang.String text,
                                                       java.util.Locale locale)
      • replaceCase

        public static java.lang.String replaceCase​(java.lang.String txt,
                                                   java.util.Locale lang)
        Interpret the case replacement language used in regular expressions:
        • backslash u = uppercase next letter
        • backslash l = lowercase next letter
        • backslash U = uppercase next letters until backslash E
        • backslash L = lowercase next letters until backslash E
        • backslash u + backslash L = uppercase next letter then lowercase all until backslash E
        • backslash l + backslash U = lowercase next letter then uppercase all until backslash E
        Warning: this method works on the string you give it; any other substitutions, such as variable conversions, must be done before calling replaceCase, otherwise the case directives will not apply to the parts that have not yet been converted.
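
        The directives listed above can be read as a small state machine: a one-shot flag for the lowercase-letter directives and a mode for the uppercase-letter ones. The following is a sketch under stated assumptions, not the library's code. The class name is hypothetical, and a one-shot directive is assumed to apply to the next character whether or not it is a letter.

```java
import java.util.Locale;

final class ReplaceCaseSketch {
    static String replaceCase(String txt, Locale locale) {
        StringBuilder out = new StringBuilder(txt.length());
        char oneShot = 0; // pending 'u' or 'l', applies to the next character only
        char mode = 0;    // active 'U' or 'L', applies until backslash E
        int i = 0;
        while (i < txt.length()) {
            int cp = txt.codePointAt(i);
            if (cp == '\\' && i + 1 < txt.length()) {
                char next = txt.charAt(i + 1);
                if (next == 'u' || next == 'l') { oneShot = next; i += 2; continue; }
                if (next == 'U' || next == 'L') { mode = next; i += 2; continue; }
                if (next == 'E') { mode = 0; i += 2; continue; }
            }
            String s = new String(Character.toChars(cp));
            // A pending one-shot directive takes precedence over the active mode.
            char effective = oneShot != 0 ? oneShot : mode;
            if (effective == 'u' || effective == 'U') {
                s = s.toUpperCase(locale);
            } else if (effective == 'l' || effective == 'L') {
                s = s.toLowerCase(locale);
            }
            oneShot = 0;
            out.append(s);
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

        Because the directives are interpreted in a single left-to-right pass, text substituted into the string afterwards never sees them, which is the point of the warning above.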
      • matchCapitalization

        public static java.lang.String matchCapitalization​(java.lang.String text,
                                                           java.lang.String matchTo,
                                                           java.util.Locale locale)
      • toTitleCase

        public static java.lang.String toTitleCase​(java.lang.String text,
                                                   java.util.Locale locale)
        Convert text to title case according to the supplied locale.
      • nvl

        @SafeVarargs
        public static <T> T nvl​(T... values)
        Returns the first non-null object from the list, or null if all values are null.
      • nvlLong

        public static long nvlLong​(long... values)
        Returns the first non-zero value from the list, or zero if all values are zero.
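
        The two "first usable value" helpers, nvl and nvlLong, are easy to picture as simple loops. A minimal sketch follows; the class name is hypothetical, and returning zero for an empty argument list is an assumption.

```java
final class NvlSketch {
    // First non-null argument, or null if there is none.
    @SafeVarargs
    static <T> T nvl(T... values) {
        for (T value : values) {
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    // First non-zero argument, or zero if there is none.
    static long nvlLong(long... values) {
        for (long value : values) {
            if (value != 0) {
                return value;
            }
        }
        return 0;
    }
}
```

        A typical use is falling back through a chain of optional settings, for example nvl(userValue, projectValue, "default").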
      • compareToWithNulls

        public static <T extends java.lang.Comparable<T>> int compareToWithNulls​(T v1,
                                                                                 T v2)
        Compare two values, which could be null.
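
        The description does not say how nulls are ordered. The sketch below assumes that null sorts before any non-null value and that two nulls compare as equal; it uses the JDK's Comparator.nullsFirst, and the class and method names are hypothetical.

```java
import java.util.Comparator;

final class NullCompareSketch {
    // Assumption: null is smaller than every non-null value.
    static <T extends Comparable<T>> int compareWithNulls(T v1, T v2) {
        return Comparator.nullsFirst(Comparator.<T>naturalOrder()).compare(v1, v2);
    }
}
```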
      • firstN

        public static java.lang.String firstN​(java.lang.String str,
                                              int len)
        Extracts first N codepoints from string.
      • truncate

        public static java.lang.String truncate​(java.lang.String text,
                                                int len)
        Truncate the supplied text to a maximum of len codepoints. If truncated, the result will be the first (len - 1) codepoints plus a trailing ellipsis.
        Parameters:
        text - The text to truncate
        len - The desired length (in codepoints) of the result
        Returns:
        The truncated string
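
        Counting in codepoints rather than chars matters for text containing surrogate pairs. The following sketch shows one way firstN and truncate could behave. It is not the library's code: the class name is hypothetical, the ellipsis is assumed to be U+2026 (the actual value of TRUNCATE_CHAR is not stated here), and len is assumed to be at least 1 for truncate.

```java
final class TruncateSketch {
    // Assumption: the trailing ellipsis is HORIZONTAL ELLIPSIS.
    static final char ELLIPSIS = '\u2026';

    // First len code points of str, or str itself if it is short enough.
    static String firstN(String str, int len) {
        if (str.codePointCount(0, str.length()) <= len) {
            return str;
        }
        return str.substring(0, str.offsetByCodePoints(0, len));
    }

    // At most len code points: the first (len - 1) plus an ellipsis when cut.
    // Assumption: len >= 1.
    static String truncate(String text, int len) {
        if (text.codePointCount(0, text.length()) <= len) {
            return text;
        }
        return text.substring(0, text.offsetByCodePoints(0, len - 1)) + ELLIPSIS;
    }
}
```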
      • getFirstLetterLowercase

        public static int getFirstLetterLowercase​(java.lang.String s)
        Returns the first letter in lowercase. Usually used to create tag shortcuts.
      • isSubstringAfter

        public static boolean isSubstringAfter​(java.lang.String text,
                                               int pos,
                                               java.lang.String substring)
        Checks if text contains substring after specified position.
      • isSubstringBefore

        public static boolean isSubstringBefore​(java.lang.String text,
                                                int pos,
                                                java.lang.String substring)
        Checks if text contains substring before specified position.
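
        The descriptions of isSubstringAfter and isSubstringBefore leave open whether the substring must sit directly at the position. The sketch below assumes it must: "after" means the substring starts at pos, and "before" means it ends at pos. That reading, like the class name, is an assumption and not confirmed by this page.

```java
final class SubstringAtSketch {
    // Assumption: substring must start exactly at pos.
    static boolean isSubstringAfter(String text, int pos, String substring) {
        return text.regionMatches(pos, substring, 0, substring.length());
    }

    // Assumption: substring must end exactly at pos.
    static boolean isSubstringBefore(String text, int pos, String substring) {
        return text.regionMatches(pos - substring.length(), substring, 0, substring.length());
    }
}
```

        String.regionMatches returns false instead of throwing when the region falls outside the text, so no explicit bounds checks are needed.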
      • stripFromEnd

        public static java.lang.String stripFromEnd​(java.lang.String string,
                                                    java.lang.String... toStrip)
      • normalizeUnicode

        public static java.lang.String normalizeUnicode​(java.lang.CharSequence text)
        Apply Unicode NFC normalization to a string.
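
        NFC normalization composes base characters and combining marks into precomposed forms where they exist. A minimal sketch using the JDK's java.text.Normalizer, which may or may not be what this method uses internally (the class name is hypothetical):

```java
import java.text.Normalizer;

final class NfcSketch {
    // Compose decomposed sequences into their canonical precomposed form.
    static String normalizeUnicode(CharSequence text) {
        return Normalizer.normalize(text, Normalizer.Form.NFC);
    }
}
```

        For example, "e" followed by COMBINING ACUTE ACCENT (U+0301) becomes the single code point U+00E9.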
      • removeXMLInvalidChars

        public static java.lang.String removeXMLInvalidChars​(java.lang.String str)
        Replace invalid XML chars by spaces.
        Parameters:
        str - input string
        Returns:
        result string
        See Also:
        Supported chars
      • isValidXMLChar

        public static boolean isValidXMLChar​(int codePoint)
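
        As an illustration of isValidXMLChar and removeXMLInvalidChars, the sketch below assumes that "valid" means the Char production of XML 1.0: tab, line feed, carriage return, and the ranges U+0020 to U+D7FF, U+E000 to U+FFFD and U+10000 to U+10FFFF. The class and method names are hypothetical.

```java
final class XmlCharSketch {
    // Assumption: validity follows the XML 1.0 Char production.
    static boolean isValidXmlChar(int cp) {
        return cp == 0x09 || cp == 0x0A || cp == 0x0D
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Replace every invalid code point with a single space.
    static String replaceInvalid(String str) {
        StringBuilder sb = new StringBuilder(str.length());
        str.codePoints().forEach(cp -> sb.appendCodePoint(isValidXmlChar(cp) ? cp : ' '));
        return sb.toString();
    }
}
```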
      • makeValidXML

        public static java.lang.String makeValidXML​(java.lang.String plaintext)
        Converts a stream of plaintext into valid XML. Output stream must convert stream to UTF-8 when saving to disk.
      • compressSpaces

        public static java.lang.String compressSpaces​(java.lang.String str)
        Compresses spaces in case of non-preformatting paragraph.
      • escapeXMLChars

        public static java.lang.String escapeXMLChars​(int cp)
        Converts a single code point into valid XML. Output stream must convert stream to UTF-8 when saving to disk.
      • unescapeXMLEntities

        public static java.lang.String unescapeXMLEntities​(java.lang.String text)
        Converts XML entities to characters.
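
        Escaping and unescaping are inverse operations, and the order of replacements matters when unescaping. This page does not list which entities the XML methods above handle, so the sketch assumes only the ampersand, angle brackets and double quote; the class and method names are hypothetical.

```java
final class XmlEscapeSketch {
    // Assumption: only these four characters are escaped.
    static String escape(String plaintext) {
        StringBuilder sb = new StringBuilder(plaintext.length());
        for (int i = 0; i < plaintext.length(); i++) {
            char c = plaintext.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // The ampersand entity is replaced last so that an escaped ampersand
    // followed by "lt;" is not decoded twice.
    static String unescape(String text) {
        return text.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&amp;", "&");
    }
}
```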
      • equal

        public static boolean equal​(java.lang.String one,
                                    java.lang.String two)
        Compares two strings for equality. Handles nulls: if both strings are nulls they are considered equal.
      • format

        public static java.lang.String format​(java.lang.String str,
                                              java.lang.Object... arguments)
        Formats UI strings. Note: This is only a first attempt at putting right what goes wrong in MessageFormat. Currently it only duplicates single quotes; it does not test whether the string contains parameters (numbers in curly braces), and it does not allow for strings containing already-escaped quotes.
        Parameters:
        str - The string to format
        arguments - Arguments to use in formatting the string
        Returns:
        The formatted string
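
        MessageFormat treats a single quote as the start of a quoted section, so an apostrophe in a UI string would otherwise be swallowed. Based on the note above, the behavior might look like the following sketch (hypothetical class name; not the library's actual code):

```java
import java.text.MessageFormat;

final class UiFormatSketch {
    // Double every single quote so MessageFormat emits it literally.
    static String format(String str, Object... arguments) {
        return MessageFormat.format(str.replace("'", "''"), arguments);
    }
}
```

        As the note warns, a string whose quotes are already doubled would come out with two quotes under this approach.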
      • normalizeWidth

        public static java.lang.String normalizeWidth​(java.lang.String text)
        Normalize the width of characters in the supplied text. Specifically:
        • ASCII characters will become halfwidth
        • Katakana characters will become fullwidth
        • Hangul will become fullwidth
        • Letter-like symbols and squared Latin abbreviations will be decomposed to ASCII
        This method was adapted from FullWidthConversionStep.java in the Okapi Framework under GPLv2+.
        Parameters:
        text - the text to normalize
        Returns:
        Normalized-width text
      • rstrip

        public static java.lang.String rstrip​(java.lang.String text)
        Strip whitespace from the end of a string. Uses Character.isWhitespace(int), so it does not strip the extra non-breaking whitespace included in isWhiteSpace(int).
        Parameters:
        text - the text to strip
        Returns:
        text with trailing whitespace removed
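
        A sketch of stripping trailing whitespace with Character.isWhitespace(int), walking backwards by code point. The class name is hypothetical. Note how a trailing NO-BREAK SPACE is left in place, as the description says.

```java
final class RstripSketch {
    static String rstrip(String text) {
        int end = text.length();
        while (end > 0) {
            int cp = text.codePointBefore(end);
            if (!Character.isWhitespace(cp)) {
                break;
            }
            end -= Character.charCount(cp);
        }
        return text.substring(0, end);
    }
}
```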
      • encodeBase64

        public static java.lang.String encodeBase64​(java.lang.String string,
                                                    java.nio.charset.Charset charset)
        Convert a string's charset bytes into a Base64-encoded String.
        Parameters:
        string - a string
        charset - the charset with which to obtain the bytes
        Returns:
        Base64-encoded String
      • encodeBase64

        public static java.lang.String encodeBase64​(char[] chars,
                                                    java.nio.charset.Charset charset)
        Convert a char array's charset bytes into a Base64-encoded String. Useful for handling passwords. Intermediate buffers are cleared after use.
        Parameters:
        chars - a char array
        charset - the charset with which to obtain the bytes
        Returns:
        Base64-encoded String
      • decodeBase64

        public static java.lang.String decodeBase64​(java.lang.String b64data,
                                                    java.nio.charset.Charset charset)
        Decode the Base64-encoded charset bytes back to a String.
        Parameters:
        b64data - Base64-encoded String
        charset - charset of decoded bytes
        Returns:
        String
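
        The three Base64 methods can be pictured with java.util.Base64. The sketch below is an approximation under stated assumptions, not the library's code: the class name is hypothetical, and "intermediate buffers are cleared" is interpreted as zeroing the byte arrays that held the encoded characters.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.util.Arrays;
import java.util.Base64;

final class Base64Sketch {
    static String encode(String string, Charset charset) {
        return Base64.getEncoder().encodeToString(string.getBytes(charset));
    }

    // Variant for secrets: never builds a String from the raw characters.
    static String encode(char[] chars, Charset charset) {
        ByteBuffer buffer = charset.encode(CharBuffer.wrap(chars));
        byte[] raw = new byte[buffer.remaining()];
        buffer.get(raw);
        String result = Base64.getEncoder().encodeToString(raw);
        // Wipe the intermediate copies of the sensitive bytes.
        Arrays.fill(raw, (byte) 0);
        if (buffer.hasArray()) {
            Arrays.fill(buffer.array(), (byte) 0);
        }
        return result;
    }

    static String decode(String b64data, Charset charset) {
        return new String(Base64.getDecoder().decode(b64data), charset);
    }
}
```

        The caller of the char[] variant remains responsible for clearing its own array, and the Base64 result is itself an immutable String.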
      • getTailSegments

        public static java.lang.String getTailSegments​(java.lang.String str,
                                                       int separator,
                                                       int segments)
        For a string delimited by some separator, retrieve the last N segments, where N is the value of segments.
        Parameters:
        str - The string
        separator - The separator delimiting the string's segments
        segments - The number of segments to return, starting at the end
        Returns:
        The trailing segments, or, if segments is greater than the number of segments contained in str, then str itself.
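
        One way to get the trailing segments is to search backwards for the separator once per requested segment. This is a sketch with a hypothetical class name; returning an empty string for a non-positive segment count is an assumption.

```java
final class TailSegmentsSketch {
    static String getTailSegments(String str, int separator, int segments) {
        if (segments <= 0) {
            return ""; // assumption: nothing requested, nothing returned
        }
        int index = str.length();
        for (int i = 0; i < segments; i++) {
            // Look for the previous separator strictly before the last match.
            index = str.lastIndexOf(separator, index - 1);
            if (index == -1) {
                return str; // fewer separators than requested segments
            }
        }
        return str.substring(index + Character.charCount(separator));
    }
}
```

        With str "a/b/c" and separator '/', asking for 1 segment gives "c", 2 gives "b/c", and 3 or more gives the whole string.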
      • convertToList

        public static java.util.List<java.lang.String> convertToList​(java.lang.String str)
        For a string containing a space-separated list of items, convert that string into an ArrayList.
        Parameters:
        str - The string, with items separated by whitespace
        Returns:
        An ArrayList of the items in the original space-separated list
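
        A minimal sketch of the conversion, splitting on runs of whitespace. The class name is hypothetical, and returning an empty list for blank input is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ListSketch {
    static List<String> convertToList(String str) {
        String trimmed = str.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>(); // assumption: blank input yields no items
        }
        // Any run of whitespace counts as one separator.
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }
}
```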
      • wrap

        public static java.lang.String wrap​(java.lang.String text,
                                            int length)
        Wrap line by length.
        Parameters:
        text - string to process.
        length - wrap length.
        Returns:
        string wrapped.