Package org.apache.iceberg.util
Class UnicodeUtil
java.lang.Object
org.apache.iceberg.util.UnicodeUtil
Method Summary
Modifier and Type, Method, Description

static boolean isCharHighSurrogate(char ch)
    Determines if the given character value is a Unicode high-surrogate code unit.

static CharSequence truncateString(CharSequence input, int length)
    Truncates the input CharSequence so that the result is a valid Unicode string containing at most length Unicode characters.

static Literal<CharSequence> truncateStringMax(Literal<CharSequence> input, int length)
    Returns a valid Unicode CharSequence that is greater than the given input and contains at most length Unicode characters.

static Literal<CharSequence> truncateStringMin(Literal<CharSequence> input, int length)
    Returns a valid Unicode CharSequence that is lower than the given input and contains at most length Unicode characters.
Method Details
isCharHighSurrogate
public static boolean isCharHighSurrogate(char ch)
Determines if the given character value is a Unicode high-surrogate code unit. The range of high-surrogates is 0xD800 - 0xDBFF.
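The range check described above can be illustrated without the Iceberg library. The following is a minimal sketch, not the library's implementation: the class `HighSurrogateSketch` and its method `isHighSurrogate` are hypothetical names, and the check is written only from the documented range 0xD800 - 0xDBFF.

```java
public class HighSurrogateSketch {
    // Hypothetical stand-in for the documented check: true only for
    // UTF-16 code units in the high-surrogate range 0xD800..0xDBFF.
    public static boolean isHighSurrogate(char ch) {
        return ch >= 0xD800 && ch <= 0xDBFF;
    }

    public static void main(String[] args) {
        // 'a' followed by U+1F600, which UTF-16 encodes as the pair D83D DE00.
        String s = "a\uD83D\uDE00";
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            System.out.printf("U+%04X high-surrogate=%b%n", (int) ch, isHighSurrogate(ch));
        }
    }
}
```

Only the first half of a surrogate pair falls in this range. The second half (the low surrogate, 0xDC00 - 0xDFFF) and ordinary characters do not.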
truncateString
public static CharSequence truncateString(CharSequence input, int length)
Truncates the input CharSequence so that the result is a valid Unicode string containing at most length Unicode characters.
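To show what truncating to a valid Unicode string can look like, here is a minimal sketch using only the JDK. It assumes that "Unicode characters" means code points and that a non-positive length is rejected. Both are assumptions of this sketch, and the class `TruncateSketch` and method `truncate` are hypothetical names rather than the library's code.

```java
public class TruncateSketch {
    // Hypothetical helper: keeps at most `length` code points and never
    // cuts between the two halves of a surrogate pair.
    public static CharSequence truncate(CharSequence input, int length) {
        if (length <= 0) {
            // Assumption of this sketch: a non-positive length is an error.
            throw new IllegalArgumentException("length must be positive");
        }
        String s = input.toString();
        if (s.codePointCount(0, s.length()) <= length) {
            return s; // already short enough, nothing to cut
        }
        // offsetByCodePoints returns the char index that lies `length` code points in.
        int end = s.offsetByCodePoints(0, length);
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE00b"; // three code points, four UTF-16 chars
        CharSequence t = truncate(s, 2);
        System.out.println(t.length()); // prints 3: 'a' plus the intact surrogate pair
    }
}
```

Cutting by `char` index instead (for example `s.substring(0, 2)`) would leave a dangling high surrogate, which is the invalid result the method description rules out.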
truncateStringMin
public static Literal<CharSequence> truncateStringMin(Literal<CharSequence> input, int length)
Returns a valid Unicode CharSequence that is lower than the given input and contains at most length Unicode characters.
truncateStringMax
public static Literal<CharSequence> truncateStringMax(Literal<CharSequence> input, int length)
Returns a valid Unicode CharSequence that is greater than the given input and contains at most length Unicode characters.
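The two bound methods can be illustrated on plain strings. The sketch below shows one common way to derive such bounds and is not taken from the library: a truncated prefix serves as a lower bound, and an upper bound is formed by incrementing the last code point of the prefix that can be incremented. The class `TruncateBoundsSketch`, its method names, the use of `String` instead of `Literal<CharSequence>`, and the `null` return when no upper bound exists are all assumptions made for illustration.

```java
public class TruncateBoundsSketch {
    // Keeps at most `length` code points (same idea as truncateString).
    public static String truncate(String s, int length) {
        if (length <= 0) {
            throw new IllegalArgumentException("length must be positive");
        }
        if (s.codePointCount(0, s.length()) <= length) {
            return s;
        }
        return s.substring(0, s.offsetByCodePoints(0, length));
    }

    // A prefix never sorts after the string it was cut from.
    public static String lowerBound(String s, int length) {
        return truncate(s, length);
    }

    // Increment the last code point that can be incremented and drop the rest.
    // Returns null when every code point is already U+10FFFF (assumption of this sketch).
    public static String upperBound(String s, int length) {
        String prefix = truncate(s, length);
        if (prefix.length() == s.length()) {
            return s; // nothing was cut, the input is its own bound
        }
        int[] cps = prefix.codePoints().toArray();
        for (int i = cps.length - 1; i >= 0; i--) {
            int next = cps[i] + 1;
            if (next == Character.MIN_SURROGATE) {
                next = Character.MAX_SURROGATE + 1; // skip the surrogate block
            }
            if (next <= Character.MAX_CODE_POINT) {
                cps[i] = next;
                return new String(cps, 0, i + 1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(lowerBound("iceberg", 3)); // prints ice
        System.out.println(upperBound("iceberg", 3)); // prints icf
    }
}
```

With these bounds, "ice" sorts at or before "iceberg" and "icf" sorts after it, so a short pair of values can bracket a long one. Note that `String.compareTo` orders by UTF-16 code unit, so comparisons involving supplementary characters may need a code-point-aware comparator.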