Class UnicodeUtil

java.lang.Object
org.apache.iceberg.util.UnicodeUtil

public class UnicodeUtil extends Object
  • Method Details

    • isCharHighSurrogate

      public static boolean isCharHighSurrogate(char ch)
      Determines whether the given character value is a Unicode high-surrogate code unit. High surrogates occupy the range 0xD800 to 0xDBFF.
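
The range check described above can be illustrated with plain JDK code. This is a minimal sketch, not Iceberg's implementation: the class name `HighSurrogateSketch` and the method `isHighSurrogate` are hypothetical names introduced for illustration only.

```java
// Hypothetical sketch: reproduces the documented range check without Iceberg.
final class HighSurrogateSketch {
  private HighSurrogateSketch() {}

  // Returns true when ch lies in the high-surrogate range 0xD800..0xDBFF.
  static boolean isHighSurrogate(char ch) {
    return ch >= 0xD800 && ch <= 0xDBFF;
  }
}
```

For a supplementary character such as U+1F600, which UTF-16 encodes as the pair 0xD83D 0xDE00, the first code unit satisfies the check and the second (a low surrogate) does not. The JDK method `Character.isHighSurrogate(char)` tests the same range.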
    • truncateString

      public static CharSequence truncateString(CharSequence input, int length)
      Truncates the input CharSequence such that the truncated CharSequence is a valid Unicode string and the number of Unicode characters in the truncated CharSequence is less than or equal to length.
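
To show why truncation has to respect surrogate pairs, here is a minimal sketch that cuts a string after a given number of code points using only JDK methods. It assumes that "Unicode characters" means code points and that a negative length is rejected. Both are assumptions of this sketch, as are the names `TruncateSketch` and `truncateCodePoints`; this is not Iceberg's implementation.

```java
// Hypothetical sketch: code-point-aware truncation using only the JDK.
final class TruncateSketch {
  private TruncateSketch() {}

  // Keeps at most `length` code points, never splitting a surrogate pair.
  static CharSequence truncateCodePoints(CharSequence input, int length) {
    if (length < 0) {
      // Assumption of this sketch: negative lengths are invalid.
      throw new IllegalArgumentException("length must not be negative");
    }
    String s = input.toString();
    if (s.codePointCount(0, s.length()) <= length) {
      return s; // already short enough, nothing to cut
    }
    // Translate a code-point count into a UTF-16 index.
    int end = s.offsetByCodePoints(0, length);
    return s.substring(0, end);
  }
}
```

With the input `"a\uD83D\uDE00b"` (three code points, four UTF-16 code units), a length of 2 keeps `"a\uD83D\uDE00"` intact, whereas a naive `substring(0, 2)` would leave a dangling high surrogate.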
    • truncateStringMin

      public static Literal<CharSequence> truncateStringMin(Literal<CharSequence> input, int length)
      Returns a valid Unicode CharSequence that is lower than the given input such that the number of Unicode characters in the truncated CharSequence is less than or equal to length.
    • truncateStringMax

      public static Literal<CharSequence> truncateStringMax(Literal<CharSequence> input, int length)
      Returns a valid Unicode CharSequence that is greater than the given input such that the number of Unicode characters in the truncated CharSequence is less than or equal to length.
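
The idea behind truncated lower and upper bounds can be sketched with plain strings. The sketch below is an illustration under stated assumptions, not Iceberg's implementation: it works on `String` rather than `Literal<CharSequence>`, it orders strings by code point, it returns the input unchanged when no truncation is needed, and it returns `null` when no upper bound of the requested length exists. The names `TruncateBoundsSketch`, `prefix`, `lowerBound`, and `upperBound` are hypothetical.

```java
// Hypothetical sketch: truncated bounds on plain strings, ordered by code point.
final class TruncateBoundsSketch {
  private TruncateBoundsSketch() {}

  // Keeps at most `length` code points of s.
  static String prefix(String s, int length) {
    if (s.codePointCount(0, s.length()) <= length) {
      return s;
    }
    return s.substring(0, s.offsetByCodePoints(0, length));
  }

  // A prefix never sorts after the string it was cut from.
  static String lowerBound(String s, int length) {
    return prefix(s, length);
  }

  // Truncates, then increments the last code point that can be incremented.
  static String upperBound(String s, int length) {
    String p = prefix(s, length);
    if (p.length() == s.length()) {
      return s; // nothing was cut, so the input bounds itself
    }
    int[] cps = p.codePoints().toArray();
    for (int i = cps.length - 1; i >= 0; i--) {
      int next = cps[i] + 1;
      if (next >= Character.MIN_SURROGATE && next <= Character.MAX_SURROGATE) {
        next = Character.MAX_SURROGATE + 1; // skip the surrogate block
      }
      if (next <= Character.MAX_CODE_POINT) {
        cps[i] = next;
        return new String(cps, 0, i + 1); // drop everything after position i
      }
    }
    return null; // every kept code point was already the maximum
  }
}
```

For the input `"abcdef"` and a length of 3, this sketch yields the lower bound `"abc"` and the upper bound `"abd"`, so the original value falls between the two. Bounds of this kind allow a short value to stand in for a long one when only range comparisons are needed.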