java.lang.Object
  edu.rice.cs.plt.text.TextUtil
public final class TextUtil
| Nested Class Summary | |
|---|---|
static class |
TextUtil.SplitString
The result of a split() invocation. |
| Field Summary | |
|---|---|
static String |
NEWLINE
The system-dependent "line.separator" property. |
static String |
NEWLINE_PATTERN
A regex matching any line break: \r\n, \n, or \r. |
| Method Summary | |
|---|---|
static boolean |
contains(String s,
int character)
Determine if the given character occurs in s. |
static boolean |
contains(String s,
String piece)
Determine if the given string occurs in s. |
static boolean |
containsAll(String s,
int... characters)
Determine if all of the given characters occur in s. |
static boolean |
containsAll(String s,
String... pieces)
Determine if all of the given strings occur in s. |
static boolean |
containsAllIgnoreCase(String s,
String... pieces)
Determine if all of the given strings occur in s, ignoring differences in case. |
static boolean |
containsAny(String s,
int... characters)
Determine if any of the given characters occurs in s. |
static boolean |
containsAny(String s,
String... pieces)
Determine if any of the given strings occurs in s. |
static boolean |
containsAnyIgnoreCase(String s,
String... pieces)
Determine if any of the given strings occurs in s, ignoring differences in case. |
static boolean |
containsIgnoreCase(String s,
String piece)
Determine if the given string occurs in s, ignoring differences in case. |
static boolean |
endsWithAny(String s,
String... suffixes)
Determine if any of the given strings is a suffix of s. |
static SizedIterable<String> |
getLines(String s)
Break a string into a list of lines. |
static String |
htmlEscape(String s)
Convert the given string to an escaped form compatible with HTML. |
static String |
htmlUnescape(String s)
Interpret all HTML character entities in the given string. |
static int |
indexOfFirst(String s,
int... characters)
Find the first occurrence of any of the given characters in s. |
static int |
indexOfFirst(String s,
String... pieces)
Find the first occurrence of any of the given strings in s. |
static boolean |
isDecimalDigit(char c)
|
static boolean |
isHexDigit(char c)
|
static boolean |
isOctalDigit(char c)
|
static String |
javaEscape(String s)
Convert the given string to a form compatible with the Java language specification for character and string literals (see JLS 3.10.6). |
static String |
javaUnescape(String s)
Convert a string potentially containing Java character escapes (as in javaEscape(java.lang.String)) to its
unescaped equivalent. |
static String |
padLeft(String s,
char c,
int length)
Create a string of (at least) the given length by filling in copies of c to the left of s. |
static String |
padRight(String s,
char c,
int length)
Create a string of (at least) the given length by filling in copies of c to the right of s. |
static String |
prefix(String s,
int delim)
Extract the portion of s before the first occurrence of the given delimiter. |
static String |
regexEscape(String s)
Produce a regular expression that matches the given string. |
static String |
removePrefix(String s,
int delim)
Extract the portion of s after the first occurrence of the given delimiter. |
static String |
removeSuffix(String s,
int delim)
Extract the portion of s before the last occurrence of the given delimiter. |
static String |
repeat(char c,
int copies)
Produce a string by concatenating copies instances of c |
static String |
repeat(String s,
int copies)
Produce a string by concatenating copies instances of s |
static String |
sgmlEscape(String s,
Map<Character,String> entities,
boolean convertToAscii)
Convert the given string to a form containing SGML character entities. |
static String |
sgmlUnescape(String s,
Map<String,Character> entities)
Interpret all SGML character entities in the given string according to the provided name-character mapping. |
static TextUtil.SplitString |
split(String s,
String delimRegex,
Bracket... brackets)
An extended version of String.split(String, int) that recognizes nested matched brackets and only splits
where the delimiter occurs at the top level. |
static TextUtil.SplitString |
split(String s,
String delimRegex,
int limit,
Bracket... brackets)
An extended version of String.split(String, int) that recognizes nested matched brackets and only splits
where the delimiter occurs at the top level. |
static TextUtil.SplitString |
splitWithParens(String s,
String delimRegex)
An extended version of String.split(String, int) that recognizes nested parentheses and only splits
where the delimiter occurs at the top level. |
static TextUtil.SplitString |
splitWithParens(String s,
String delimRegex,
int limit)
An extended version of String.split(String, int) that recognizes nested parentheses and only splits
where the delimiter occurs at the top level. |
static boolean |
startsWithAny(String s,
String... prefixes)
Determine if any of the given strings is a prefix of s. |
static String |
suffix(String s,
int delim)
Extract the portion of s after the last occurrence of the given delimiter. |
static String |
toHexString(byte[] bs)
Express a byte array as a sequence of unsigned hexadecimal bytes. |
static String |
toHexString(byte[] bs,
int offset,
int length)
Express a byte array as a sequence of unsigned hexadecimal bytes. |
static String |
toString(Object o)
Convert the given object to a string. |
static String |
unicodeEscape(String s)
Convert all non-ASCII characters in the string to Unicode escapes, as specified by JLS 3.3. |
static String |
unicodeUnescape(String s)
Convert all Unicode escapes in the string into their equivalent Unicode characters, as specified by JLS 3.3. |
static String |
unicodeUnescapeOnce(String s)
Convert all one-level Unicode escapes in the string to their equivalent characters, as specified by JLS 3.3. |
static String |
xmlEscape(String s)
Convert the given string to an escaped form compatible with XML. |
static String |
xmlEscape(String s,
boolean convertToAscii)
Convert the given string to an escaped form compatible with XML. |
static String |
xmlUnescape(String s)
Interpret all XML character entities in the given string. |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final String NEWLINE
The system-dependent "line.separator" property.
public static final String NEWLINE_PATTERN
A regex matching any line break: \r\n, \n, or \r.
| Method Detail |
|---|
public static String toString(Object o)
Convert the given object to a string. Uses RecurUtil.safeToString(Object)
to provide simple, safe handling of null values, arrays, and self-referential data structures
(with cooperation from the toString() method of the relevant class).
public static SizedIterable<String> getLines(String s)
Break a string into a list of lines. "\n", "\r", and "\r\n"
are considered line delimiters. The empty string is taken to contain 0 lines. A single
trailing newline, if present, is ignored.
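The line-breaking rules above can be sketched in plain Java. This is only an illustration under stated assumptions: the class name `GetLinesSketch` is hypothetical, the result is a `List<String>` rather than the library's `SizedIterable<String>`, and the actual implementation may differ.

```java
import java.util.ArrayList;
import java.util.List;

final class GetLinesSketch {
    // Same alternatives as NEWLINE_PATTERN; CR LF is listed first so it counts as one break.
    static final String LINE_BREAK = "\r\n|\n|\r";

    static List<String> getLines(String s) {
        List<String> lines = new ArrayList<>();
        if (s.isEmpty()) {
            return lines; // the empty string contains 0 lines
        }
        // A negative limit keeps trailing empty pieces so the last one can be inspected.
        String[] pieces = s.split(LINE_BREAK, -1);
        int count = pieces.length;
        if (pieces[count - 1].isEmpty()) {
            count--; // the text ended with a line break; ignore that one trailing break
        }
        for (int i = 0; i < count; i++) {
            lines.add(pieces[i]);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(getLines("one\r\ntwo\nthree\r")); // prints [one, two, three]
    }
}
```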
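public static String repeat(String s,
int copies)
Produce a string by concatenating copies instances of s.
public static String repeat(char c,
int copies)
Produce a string by concatenating copies instances of c.
public static String padLeft(String s,
char c,
int length)
Create a string of (at least) the given length by filling in copies of c to the left of s.
public static String padRight(String s,
char c,
int length)
Create a string of (at least) the given length by filling in copies of c to the right of s.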
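A minimal sketch of the padding behavior, assuming that "(at least) the given length" means a string already longer than length is returned unchanged. The class name `PadSketch` is hypothetical and the code is a stand-in, not the library source.

```java
final class PadSketch {
    // Concatenate the requested number of copies of c; zero or negative counts give "".
    static String repeat(char c, int copies) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < copies; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    // Fill on the left until the result has the requested length.
    static String padLeft(String s, char c, int length) {
        return repeat(c, length - s.length()) + s;
    }

    // Fill on the right until the result has the requested length.
    static String padRight(String s, char c, int length) {
        return s + repeat(c, length - s.length());
    }

    public static void main(String[] args) {
        System.out.println(padLeft("7", '0', 3));     // prints 007
        System.out.println(padRight("ab", '.', 4));   // prints ab..
        System.out.println(padLeft("12345", '0', 3)); // prints 12345
    }
}
```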
public static boolean contains(String s,
int character)
Determine if the given character occurs in s. Defined in terms of
String.indexOf(int).
public static boolean contains(String s,
String piece)
Determine if the given string occurs in s. Defined in terms of String.indexOf(String).
This is also available as String.contains(CharSequence), but is defined here for legacy support.
public static boolean containsAny(String s,
int... characters)
Determine if any of the given characters occurs in s. Defined in terms of
String.indexOf(int).
public static boolean containsAny(String s,
String... pieces)
Determine if any of the given strings occurs in s. Defined in terms of
String.indexOf(String).
public static boolean containsAll(String s,
int... characters)
Determine if all of the given characters occur in s. Defined in terms of
String.indexOf(int).
public static boolean containsAll(String s,
String... pieces)
Determine if all of the given strings occur in s. Defined in terms of
String.indexOf(String).
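The "defined in terms of String.indexOf" contract can be illustrated as follows. The class name `ContainsSketch` is hypothetical. The behavior for an empty argument list (containsAny gives false, containsAll gives true) is an assumption, since the documentation does not state it.

```java
final class ContainsSketch {
    // True as soon as one character is found in s.
    static boolean containsAny(String s, int... characters) {
        for (int c : characters) {
            if (s.indexOf(c) >= 0) {
                return true;
            }
        }
        return false;
    }

    // True as soon as one piece is found in s.
    static boolean containsAny(String s, String... pieces) {
        for (String piece : pieces) {
            if (s.indexOf(piece) >= 0) {
                return true;
            }
        }
        return false;
    }

    // False as soon as one piece is missing from s.
    static boolean containsAll(String s, String... pieces) {
        for (String piece : pieces) {
            if (s.indexOf(piece) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(containsAny("hello world", "xyz", "wor"));   // prints true
        System.out.println(containsAll("hello world", "hello", "xyz")); // prints false
        System.out.println(containsAny("abc", 'x', 'b'));               // prints true
    }
}
```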
public static boolean containsIgnoreCase(String s,
String piece)
Determine if the given string occurs in s, ignoring differences in case. Unlike
String.equalsIgnoreCase(java.lang.String), this test only compares the lower-case conversion of
s to the lower-case conversion of piece.
public static boolean containsAnyIgnoreCase(String s,
String... pieces)
Determine if any of the given strings occurs in s, ignoring differences in case. Defined in
terms of containsIgnoreCase(java.lang.String, java.lang.String).
public static boolean containsAllIgnoreCase(String s,
String... pieces)
Determine if all of the given strings occur in s, ignoring differences in case. Defined in
terms of containsIgnoreCase(java.lang.String, java.lang.String).
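A rough sketch of the lower-case comparison described above. The class name `IgnoreCaseSketch` is hypothetical, and the use of `Locale.ROOT` is an assumption: the documentation does not say which locale governs the conversion.

```java
import java.util.Locale;

final class IgnoreCaseSketch {
    // Compare the lower-case conversion of s with the lower-case conversion of piece.
    static boolean containsIgnoreCase(String s, String piece) {
        return s.toLowerCase(Locale.ROOT).contains(piece.toLowerCase(Locale.ROOT));
    }

    // The "any" variant simply delegates to containsIgnoreCase for each piece.
    static boolean containsAnyIgnoreCase(String s, String... pieces) {
        for (String piece : pieces) {
            if (containsIgnoreCase(s, piece)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsIgnoreCase("Hello World", "WORLD"));         // prints true
        System.out.println(containsAnyIgnoreCase("Hello World", "bye", "hELL")); // prints true
    }
}
```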
public static boolean startsWithAny(String s,
String... prefixes)
Determine if any of the given strings is a prefix of s. Defined in terms of
String.startsWith(java.lang.String, int).
public static boolean endsWithAny(String s,
String... suffixes)
Determine if any of the given strings is a suffix of s. Defined in terms of
String.endsWith(java.lang.String).
public static int indexOfFirst(String s,
int... characters)
Find the first occurrence of any of the given characters in s. If none are present, the result is
-1. Defined in terms of String.indexOf(int).
public static int indexOfFirst(String s,
String... pieces)
Find the first occurrence of any of the given strings in s. If none are present, the result is
-1. Defined in terms of String.indexOf(String).
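One way to read "first occurrence of any of the given characters" is the smallest index returned by String.indexOf over all candidates. The sketch below takes that reading as an assumption; the class name `IndexOfFirstSketch` is hypothetical.

```java
final class IndexOfFirstSketch {
    // Smallest index at which any of the characters occurs, or -1 if none occurs.
    static int indexOfFirst(String s, int... characters) {
        int result = -1;
        for (int c : characters) {
            int index = s.indexOf(c);
            if (index >= 0 && (result < 0 || index < result)) {
                result = index;
            }
        }
        return result;
    }

    // Same idea for string pieces.
    static int indexOfFirst(String s, String... pieces) {
        int result = -1;
        for (String piece : pieces) {
            int index = s.indexOf(piece);
            if (index >= 0 && (result < 0 || index < result)) {
                result = index;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(indexOfFirst("key=value;x", ';', '=')); // prints 3
        System.out.println(indexOfFirst("abc", 'x'));              // prints -1
    }
}
```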
public static String prefix(String s,
int delim)
Extract the portion of s before the first occurrence of the given delimiter. The result is s if the
delimiter is not found.
public static String removePrefix(String s,
int delim)
Extract the portion of s after the first occurrence of the given delimiter. The result is s if the
delimiter is not found.
public static String suffix(String s,
int delim)
Extract the portion of s after the last occurrence of the given delimiter. The result is s if the
delimiter is not found.
public static String removeSuffix(String s,
int delim)
Extract the portion of s before the last occurrence of the given delimiter. The result is s if the
delimiter is not found.
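The four extraction methods differ only in which occurrence of the delimiter they use and which side they keep. A sketch under the assumption that the delimiter itself is never part of the result (the class name `AffixSketch` is hypothetical):

```java
final class AffixSketch {
    // Text before the first delimiter, or s itself if the delimiter is absent.
    static String prefix(String s, int delim) {
        int i = s.indexOf(delim);
        return i < 0 ? s : s.substring(0, i);
    }

    // Text after the first delimiter, or s itself if the delimiter is absent.
    static String removePrefix(String s, int delim) {
        int i = s.indexOf(delim);
        return i < 0 ? s : s.substring(i + Character.charCount(delim));
    }

    // Text after the last delimiter, or s itself if the delimiter is absent.
    static String suffix(String s, int delim) {
        int i = s.lastIndexOf(delim);
        return i < 0 ? s : s.substring(i + Character.charCount(delim));
    }

    // Text before the last delimiter, or s itself if the delimiter is absent.
    static String removeSuffix(String s, int delim) {
        int i = s.lastIndexOf(delim);
        return i < 0 ? s : s.substring(0, i);
    }

    public static void main(String[] args) {
        String name = "edu.rice.cs";
        System.out.println(prefix(name, '.'));       // prints edu
        System.out.println(removePrefix(name, '.')); // prints rice.cs
        System.out.println(suffix(name, '.'));       // prints cs
        System.out.println(removeSuffix(name, '.')); // prints edu.rice
    }
}
```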
public static TextUtil.SplitString splitWithParens(String s,
String delimRegex)
An extended version of String.split(String, int) that recognizes nested parentheses and only splits
where the delimiter occurs at the top level. This convenience method sets limit to 0
(unlimited number of matches) and brackets to Bracket.PARENTHESES. See
split(String, String, int, Bracket[]) for a full specification.
public static TextUtil.SplitString splitWithParens(String s,
String delimRegex,
int limit)
An extended version of String.split(String, int) that recognizes nested parentheses and only splits
where the delimiter occurs at the top level. This convenience method sets brackets to
Bracket.PARENTHESES. See split(String, String, int, Bracket[]) for a full
specification.
public static TextUtil.SplitString split(String s,
String delimRegex,
Bracket... brackets)
An extended version of String.split(String, int) that recognizes nested matched brackets and only splits
where the delimiter occurs at the top level. This convenience method sets limit to 0
(unlimited number of matches). See split(String, String, int, Bracket[]) for a full
specification.
public static TextUtil.SplitString split(String s,
String delimRegex,
int limit,
Bracket... brackets)
An extended version of String.split(String, int) that recognizes nested matched brackets and only splits
where the delimiter occurs at the top level. For convenience when the delimiter is a nontrivial
regular expression, the result includes both the split strings and the matched delimiters. Ignoring
these extensions, s.split(delimRegex, limit) is roughly equivalent
to TextUtil.split(s, delimRegex, limit).array(), except that trailing empty strings
(separated by delimiters) are never discarded here.
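To illustrate top-level splitting, the following simplified sketch handles only parentheses, ignores limit, and returns just the pieces (not the matched delimiters). The class and method names are hypothetical; it is not the library's algorithm, only one plausible way to get the described behavior for simple inputs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TopLevelSplitSketch {
    static List<String> splitTopLevel(String s, String delimRegex) {
        Matcher m = Pattern.compile(delimRegex).matcher(s);
        List<String> pieces = new ArrayList<>();
        int depth = 0;  // current parenthesis nesting depth
        int start = 0;  // start of the piece being collected
        int i = 0;
        while (i < s.length()) {
            char ch = s.charAt(i);
            if (ch == '(') {
                depth++;
                i++;
            } else if (ch == ')' && depth > 0) {
                depth--;
                i++;
            } else if (depth == 0 && m.region(i, s.length()).lookingAt() && m.end() > i) {
                // A non-empty delimiter match at the top level ends the current piece.
                pieces.add(s.substring(start, i));
                start = m.end();
                i = m.end();
            } else {
                i++;
            }
        }
        pieces.add(s.substring(start)); // the last piece is kept even when it is empty
        return pieces;
    }

    public static void main(String[] args) {
        System.out.println(splitTopLevel("f(a, b), g(c), h", ",\\s*")); // prints [f(a, b), g(c), h]
        System.out.println(splitTopLevel("a,b,,", ",").size());         // prints 4
        System.out.println("a,b,,".split(",").length);                  // prints 2
    }
}
```

The last two lines of main show the documented difference from String.split: trailing empty strings are kept by the sketch and dropped by String.split with its default limit.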
Parameters:
s - A string to split
delimRegex - A regular expression recognizing delimiters
limit - The number of non-delimiter pieces to produce. Consistent with String.split(),
limit-1 is the number of delimiters to search for. If 0 or negative, the
search continues until the string is exhausted. Unlike String.split(), trailing
empty strings (separated by delimiters) are never discarded, even when limit == 0.
brackets - Bracket pairs that should be recognized. A delimiter match that occurs within one of
these bracket pairs (at any nonzero nesting depth) is not considered a delimiter.
A left bracket increases the nesting level only if it is at the top level or follows
another left bracket that supports nesting; a right bracket reduces the nesting level
only if it matches the most recent left bracket. If delimRegex recognizes part
of a valid bracket (e.g., "*" is the delimiter and "/*" is a bracket),
the handling of the affected text is unspecified (it would be nice, but difficult, to fix this).
If multiple brackets overlap, an expected right bracket will match before a left bracket,
and the first left bracket listed in brackets has priority over later left
brackets.
public static String toHexString(byte[] bs)
Express a byte array as a sequence of unsigned hexadecimal bytes.
public static String toHexString(byte[] bs,
int offset,
int length)
Express a byte array as a sequence of unsigned hexadecimal bytes.
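The key point of "unsigned hexadecimal bytes" is that Java's byte is signed, so each value has to be masked before conversion. The output format is not documented here; this sketch assumes two lowercase digits per byte with no separator, and the class name `HexSketch` is hypothetical.

```java
final class HexSketch {
    static String toHexString(byte[] bs, int offset, int length) {
        StringBuilder sb = new StringBuilder();
        for (int i = offset; i < offset + length; i++) {
            int unsigned = bs[i] & 0xFF; // mask so negative bytes map to 128..255
            sb.append(Character.forDigit(unsigned >> 4, 16));  // high nibble
            sb.append(Character.forDigit(unsigned & 0xF, 16)); // low nibble
        }
        return sb.toString();
    }

    static String toHexString(byte[] bs) {
        return toHexString(bs, 0, bs.length);
    }

    public static void main(String[] args) {
        byte[] data = {0, 15, (byte) 0xFF, 127};
        System.out.println(toHexString(data));       // prints 000fff7f
        System.out.println(toHexString(data, 1, 2)); // prints 0fff
    }
}
```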
public static boolean isDecimalDigit(char c)
public static boolean isOctalDigit(char c)
public static boolean isHexDigit(char c)
public static String unicodeEscape(String s)
Convert all non-ASCII characters in the string to Unicode escapes, as specified by JLS 3.3.
An additional u is added to existing escapes in the string;
instances of \ that precede a non-ASCII character or a malformed Unicode escape will
be encoded as \u005c. The original string may be safely reconstructed with
unicodeUnescapeOnce(java.lang.String); to safely interpret all Unicode escapes, including
those in the original string, use unicodeUnescape(java.lang.String). In either case, the output of
this method is guaranteed not to trigger an IllegalArgumentException.
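The core conversion (non-ASCII character to a four-digit escape) can be sketched as below. This deliberately omits the handling of existing escapes and stray backslashes described above, so it is not round-trip safe for inputs that already contain backslashes. The class and method names are hypothetical, and lowercase hex digits are an assumption.

```java
final class UnicodeEscapeSketch {
    // Replace every UTF-16 code unit above 0x7F with a backslash, the letter u,
    // and four hex digits. Supplementary characters become two escapes.
    static String escapeNonAscii(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (ch > 0x7F) {
                sb.append(String.format("\\u%04x", (int) ch));
            } else {
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The argument is the word "cafe" with an accented final letter (U+00E9).
        System.out.println(escapeNonAscii("caf\u00e9"));
    }
}
```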
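public static String unicodeUnescapeOnce(String s)
Convert all one-level Unicode escapes in the string to their equivalent characters, as specified by JLS 3.3.
Throws:
IllegalArgumentException - If a backslash-u escape in the string is not followed by 4 hex digits
public static String unicodeUnescape(String s)
Convert all Unicode escapes in the string into their equivalent Unicode characters, as specified by JLS 3.3.
Throws:
IllegalArgumentException - If a backslash-u escape in the string is not followed by 4 hex digits
public static String javaEscape(String s)
Convert the given string to a form compatible with the Java language specification for character
and string literals (see JLS 3.10.6). The characters \, ", and ' are replaced with escape
sequences. All control characters between \u0000 and \u001F, along with
\u007F, are replaced with mnemonic escape sequences (such as "\n"), or octal escape
sequences if no mnemonic exists.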
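A sketch of the escaping rules just described. The class name `JavaEscapeSketch` is hypothetical, and emitting octal escapes with exactly three digits is an assumption made here so that a following digit cannot be misread as part of the escape.

```java
final class JavaEscapeSketch {
    static String javaEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '"':  sb.append("\\\""); break;
                case '\'': sb.append("\\'");  break;
                case '\b': sb.append("\\b");  break;
                case '\t': sb.append("\\t");  break;
                case '\n': sb.append("\\n");  break;
                case '\f': sb.append("\\f");  break;
                case '\r': sb.append("\\r");  break;
                default:
                    if (c < 0x20 || c == 0x7F) {
                        // No mnemonic exists: fall back to a three-digit octal escape.
                        sb.append(String.format("\\%03o", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(javaEscape("say \"hi\"\n")); // prints say \"hi\"\n
    }
}
```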
public static String javaUnescape(String s)
Convert a string potentially containing Java character escapes (as in javaEscape(java.lang.String)) to its
unescaped equivalent. Note that Unicode escapes are not interpreted (strings from Java source
code should first be processed by unicodeUnescape(java.lang.String)).
Throws:
IllegalArgumentException - If the character \ is followed by an invalid escape character
or the end of the string.
public static String regexEscape(String s)
Produce a regular expression that matches the given string. Backslash escape sequences are
used for all characters that potentially clash with regular expression syntax. For simplicity,
escapes are applied to all control characters (\u0000 to \u001F and
\u007F) and to all non-alphanumeric, non-space ASCII characters (in the range
\u0020 to \u007E), including those that have no special meaning in
the regular expression syntax (such as @, ", and ~). Where a
mnemonic escape for control characters exists, it is used; otherwise, the hexadecimal \xhh
notation is used.
Note: a similar method is available in Java 5: Pattern.quote(java.lang.String). It has the same basic
contract — produce a regex to match the given string — but produces different (equivalent)
results.
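The relationship between per-character escaping and Pattern.quote can be seen in a small sketch. The class name `RegexEscapeSketch` is hypothetical, and for brevity this version uses the \xhh form for every control character instead of preferring mnemonics. Both approaches yield a pattern that matches only the literal text, but the pattern strings differ.

```java
import java.util.regex.Pattern;

final class RegexEscapeSketch {
    static String regexEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x20 || c == 0x7F) {
                sb.append(String.format("\\x%02x", (int) c)); // control character
            } else if (c < 0x7F && c != ' ' && !Character.isLetterOrDigit(c)) {
                sb.append('\\').append(c); // ASCII punctuation, special or not
            } else {
                sb.append(c); // letters, digits, spaces, and non-ASCII characters
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String literal = "a.b*(c)";
        System.out.println(regexEscape(literal));                           // prints a\.b\*\(c\)
        System.out.println(Pattern.quote(literal));                         // prints \Qa.b*(c)\E
        System.out.println(Pattern.matches(regexEscape(literal), literal)); // prints true
        System.out.println(Pattern.matches(regexEscape("a.b"), "axb"));     // prints false
    }
}
```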
public static String sgmlEscape(String s,
Map<Character,String> entities,
boolean convertToAscii)
Convert the given string to a form containing SGML character entities. Characters that appear as keys in
entities will be translated to their corresponding entity names; if convertToAscii is
true, all other non-ASCII characters will be converted to numeric references.
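public static String sgmlUnescape(String s,
Map<String,Character> entities)
Interpret all SGML character entities in the given string according to the provided name-character mapping.
Throws:
IllegalArgumentException - If the string contains a malformed or unrecognized character entity
public static String xmlEscape(String s)
Convert the given string to an escaped form compatible with XML. The XML special characters
(", &, ', <, and >) will be replaced with named references
(such as &quot;), and all non-ASCII characters will be replaced with numeric references.
public static String xmlEscape(String s,
boolean convertToAscii)
Convert the given string to an escaped form compatible with XML. The XML special characters
(", &, ', <, and >) will be replaced with named references
(such as &quot;); if convertToAscii is true, all non-ASCII characters
will be replaced with numeric references.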
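The two xmlEscape variants can be sketched as follows. The class name `XmlEscapeSketch` is hypothetical, and the use of decimal (rather than hexadecimal) numeric references is an assumption, since the documentation does not specify the form.

```java
final class XmlEscapeSketch {
    static String xmlEscape(String s, boolean convertToAscii) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i); // iterate by code point so surrogate pairs stay whole
            i += Character.charCount(cp);
            switch (cp) {
                case '"':  sb.append("&quot;"); break;
                case '&':  sb.append("&amp;");  break;
                case '\'': sb.append("&apos;"); break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                default:
                    if (cp > 0x7F && convertToAscii) {
                        sb.append("&#").append(cp).append(';'); // decimal numeric reference
                    } else {
                        sb.appendCodePoint(cp);
                    }
            }
        }
        return sb.toString();
    }

    // The one-argument form always converts non-ASCII characters.
    static String xmlEscape(String s) {
        return xmlEscape(s, true);
    }

    public static void main(String[] args) {
        System.out.println(xmlEscape("a<b & \"c\"")); // prints a&lt;b &amp; &quot;c&quot;
        System.out.println(xmlEscape("caf\u00e9"));   // prints caf&#233;
    }
}
```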
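public static String xmlUnescape(String s)
Interpret all XML character entities in the given string.
Throws:
IllegalArgumentException - If the string contains a malformed or unrecognized character entity
public static String htmlEscape(String s)
Convert the given string to an escaped form compatible with HTML. The ' character will also
be replaced with a numeric reference.
public static String htmlUnescape(String s)
Interpret all HTML character entities in the given string.
Throws:
IllegalArgumentException - If the string contains a malformed or unrecognized character entity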