public class TypeUtil
extends java.lang.Object
Modifier and Type | Class and Description |
---|---|
static class | TypeUtil.CustomOrderSchemaVisitor<T> |
static interface | TypeUtil.NextID: Interface for passing a function that assigns column IDs. |
static class | TypeUtil.SchemaVisitor<T> |
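As a rough illustration of the "function that assigns column IDs" idea behind TypeUtil.NextID, the sketch below models it as a functional interface backed by a counter. The `NextID` interface, the `assignIds` helper, and the flat list of field names are hypothetical stand-ins written for this example, not the library's own types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class NextIdSketch {
    /** Hypothetical stand-in for TypeUtil.NextID: each call hands out the next column ID. */
    @FunctionalInterface
    public interface NextID {
        int get();
    }

    /** Assigns one fresh ID per field name, in order, by calling nextId once per field. */
    public static List<Integer> assignIds(List<String> fieldNames, NextID nextId) {
        List<Integer> ids = new ArrayList<>();
        for (String ignored : fieldNames) {
            ids.add(nextId.get());
        }
        return ids;
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger(0);
        // incrementAndGet returns 1 on the first call, so the IDs start from 1
        List<Integer> ids = assignIds(List.of("id", "data", "ts"), counter::incrementAndGet);
        System.out.println(ids); // [1, 2, 3]
    }
}
```

Passing the counter as a method reference keeps the ID state outside the assignment logic, so a caller can continue numbering from wherever a previous schema left off.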
Modifier and Type | Method and Description |
---|---|
static Schema | assignFreshIds(int schemaId, Schema schema, TypeUtil.NextID nextId): Assigns fresh ids from the nextId function for all fields in a schema. |
static Schema | assignFreshIds(Schema schema, Schema baseSchema, TypeUtil.NextID nextId): Assigns ids to match a given schema, and fresh ids from the nextId function for all other fields. |
static Schema | assignFreshIds(Schema schema, TypeUtil.NextID nextId): Assigns fresh ids from the nextId function for all fields in a schema. |
static Type | assignFreshIds(Type type, TypeUtil.NextID nextId): Assigns fresh ids from the nextId function for all fields in a type. |
static Schema | assignIncreasingFreshIds(Schema schema): Assigns strictly increasing fresh ids for all fields in a schema, starting from 1. |
static int | decimalRequiredBytes(int precision) |
static int | estimateSize(Types.NestedField field): Estimates the number of bytes a value for a given field may occupy in memory. |
static Type | find(Schema schema, java.util.function.Predicate<Type> predicate) |
static java.util.Set<java.lang.Integer> | getProjectedIds(Schema schema) |
static java.util.Set<java.lang.Integer> | getProjectedIds(Type type) |
static java.util.Map<java.lang.Integer,Types.NestedField> | indexById(Types.StructType struct) |
static java.util.Map<java.lang.String,java.lang.Integer> | indexByLowerCaseName(Types.StructType struct) |
static java.util.Map<java.lang.String,java.lang.Integer> | indexByName(Types.StructType struct) |
static java.util.Map<java.lang.Integer,java.lang.String> | indexNameById(Types.StructType struct) |
static java.util.Map<java.lang.Integer,java.lang.Integer> | indexParents(Types.StructType struct) |
static java.util.Map<java.lang.Integer,java.lang.String> | indexQuotedNameById(Types.StructType struct, java.util.function.Function<java.lang.String,java.lang.String> quotingFunc) |
static boolean | isPromotionAllowed(Type from, Type.PrimitiveType to) |
static Schema | join(Schema left, Schema right) |
static Schema | project(Schema schema, java.util.Set<java.lang.Integer> fieldIds): Project extracts particular fields from a schema by ID. |
static Types.StructType | project(Types.StructType struct, java.util.Set<java.lang.Integer> fieldIds) |
static Schema | reassignDoc(Schema schema, Schema docSourceSchema): Reassigns doc in a schema from another schema. |
static Schema | reassignIds(Schema schema, Schema idSourceSchema): Reassigns ids in a schema from another schema. |
static Schema | reassignIds(Schema schema, Schema idSourceSchema, boolean caseSensitive): Reassigns ids in a schema from another schema. |
static Schema | reassignOrRefreshIds(Schema schema, Schema idSourceSchema) |
static Schema | reassignOrRefreshIds(Schema schema, Schema idSourceSchema, boolean caseSensitive) |
static java.util.Set<java.lang.Integer> | refreshIdentifierFields(Types.StructType freshSchema, Schema baseSchema): Get the identifier fields in the fresh schema based on the identifier fields in the base schema. |
static Schema | select(Schema schema, java.util.Set<java.lang.Integer> fieldIds) |
static Types.StructType | select(Types.StructType struct, java.util.Set<java.lang.Integer> fieldIds) |
static Schema | selectNot(Schema schema, java.util.Set<java.lang.Integer> fieldIds) |
static Types.StructType | selectNot(Types.StructType struct, java.util.Set<java.lang.Integer> fieldIds) |
static void | validateSchema(java.lang.String context, Schema expectedSchema, Schema providedSchema, boolean checkNullability, boolean checkOrdering): Validates whether the provided schema is compatible with the expected schema. |
static void | validateWriteSchema(Schema tableSchema, Schema writeSchema, java.lang.Boolean checkNullability, java.lang.Boolean checkOrdering): Check whether we could write the iceberg table with the user-provided write schema. |
static <T> T | visit(Schema schema, TypeUtil.CustomOrderSchemaVisitor<T> visitor) |
static <T> T | visit(Schema schema, TypeUtil.SchemaVisitor<T> visitor) |
static <T> T | visit(Type type, TypeUtil.CustomOrderSchemaVisitor<T> visitor): Used to traverse types with traversals other than post-order. |
static <T> T | visit(Type type, TypeUtil.SchemaVisitor<T> visitor) |
public static Schema project(Schema schema, java.util.Set<java.lang.Integer> fieldIds)
Project extracts particular fields from a schema by ID.
Unlike select(Schema, Set), project will pick out only the fields enumerated. Structs that are explicitly projected are empty unless sub-fields are explicitly projected. Maps and lists cannot be explicitly selected in fieldIds.
schema - the schema to project fields from
fieldIds - the set of explicit field IDs to extract
public static Types.StructType project(Types.StructType struct, java.util.Set<java.lang.Integer> fieldIds)
public static Types.StructType select(Types.StructType struct, java.util.Set<java.lang.Integer> fieldIds)
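The difference between project and select described above can be sketched on a toy model. Everything in this example (the `Field` class, the list-based schema, and both helper methods) is a hypothetical simplification, and the select-like behavior of keeping a listed struct's whole subtree is an assumption inferred from the "unlike select" wording, not a copy of the library's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class ProjectVsSelectSketch {
    /** Hypothetical simplified field: children == null marks a primitive, otherwise a struct. */
    public static final class Field {
        final int id;
        final String name;
        final List<Field> children;

        public Field(int id, String name, List<Field> children) {
            this.id = id;
            this.name = name;
            this.children = children;
        }

        @Override
        public String toString() {
            return children == null ? name : name + children;
        }
    }

    /** select-like (assumed): a listed struct keeps all of its sub-fields. */
    public static List<Field> select(List<Field> fields, Set<Integer> ids) {
        List<Field> out = new ArrayList<>();
        for (Field f : fields) {
            if (ids.contains(f.id)) {
                out.add(f); // the whole subtree is kept
            } else if (f.children != null) {
                List<Field> kept = select(f.children, ids);
                if (!kept.isEmpty()) {
                    out.add(new Field(f.id, f.name, kept));
                }
            }
        }
        return out;
    }

    /** project-like: only listed fields survive; a listed struct is empty unless sub-fields are listed. */
    public static List<Field> project(List<Field> fields, Set<Integer> ids) {
        List<Field> out = new ArrayList<>();
        for (Field f : fields) {
            if (f.children == null) {
                if (ids.contains(f.id)) {
                    out.add(f);
                }
            } else {
                List<Field> kept = project(f.children, ids);
                if (ids.contains(f.id) || !kept.isEmpty()) {
                    out.add(new Field(f.id, f.name, kept));
                }
            }
        }
        return out;
    }

    /** Sample schema: id (1), location (2) containing lat (3) and lon (4). */
    public static List<Field> sample() {
        Field lat = new Field(3, "lat", null);
        Field lon = new Field(4, "lon", null);
        return List.of(new Field(1, "id", null), new Field(2, "location", List.of(lat, lon)));
    }

    public static void main(String[] args) {
        System.out.println(select(sample(), Set.of(2)));     // [location[lat, lon]]
        System.out.println(project(sample(), Set.of(2)));    // [location[]]
        System.out.println(project(sample(), Set.of(2, 3))); // [location[lat]]
    }
}
```

With only ID 2 requested, the select-like helper returns the full location struct, while the project-like helper returns an empty struct until a sub-field ID is listed as well.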
public static java.util.Set<java.lang.Integer> getProjectedIds(Schema schema)
public static java.util.Set<java.lang.Integer> getProjectedIds(Type type)
public static Types.StructType selectNot(Types.StructType struct, java.util.Set<java.lang.Integer> fieldIds)
public static java.util.Map<java.lang.String,java.lang.Integer> indexByName(Types.StructType struct)
public static java.util.Map<java.lang.Integer,java.lang.String> indexNameById(Types.StructType struct)
public static java.util.Map<java.lang.Integer,java.lang.String> indexQuotedNameById(Types.StructType struct, java.util.function.Function<java.lang.String,java.lang.String> quotingFunc)
public static java.util.Map<java.lang.String,java.lang.Integer> indexByLowerCaseName(Types.StructType struct)
public static java.util.Map<java.lang.Integer,Types.NestedField> indexById(Types.StructType struct)
public static java.util.Map<java.lang.Integer,java.lang.Integer> indexParents(Types.StructType struct)
public static Type assignFreshIds(Type type, TypeUtil.NextID nextId)
Assigns fresh ids from the nextId function for all fields in a type.
type - a type
nextId - an id assignment function
public static Schema assignFreshIds(Schema schema, TypeUtil.NextID nextId)
Assigns fresh ids from the nextId function for all fields in a schema.
schema - a schema
nextId - an id assignment function
public static Schema assignFreshIds(int schemaId, Schema schema, TypeUtil.NextID nextId)
Assigns fresh ids from the nextId function for all fields in a schema.
schemaId - an ID assigned to this schema
schema - a schema
nextId - an id assignment function
public static Schema assignFreshIds(Schema schema, Schema baseSchema, TypeUtil.NextID nextId)
Assigns ids to match a given schema, and fresh ids from the nextId function for all other fields.
schema - a schema
baseSchema - a schema with existing IDs to copy by name
nextId - an id assignment function
public static java.util.Set<java.lang.Integer> refreshIdentifierFields(Types.StructType freshSchema, Schema baseSchema)
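Gets the identifier fields in the fresh schema based on the identifier fields in the base schema.
freshSchema - fresh schema
baseSchema - base schema
public static Schema assignIncreasingFreshIds(Schema schema)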
Assigns strictly increasing fresh ids for all fields in a schema, starting from 1.
schema - a schema
public static Schema reassignIds(Schema schema, Schema idSourceSchema)
Reassigns ids in a schema from another schema.
Ids are determined by field names. If a field in the schema cannot be found in the source schema, this will throw IllegalArgumentException.
This will not alter a schema's structure, nullability, or types.
schema - the schema to have ids reassigned
idSourceSchema - the schema from which field ids will be used
java.lang.IllegalArgumentException - if a field cannot be found (by name) in the source schema
public static Schema reassignDoc(Schema schema, Schema docSourceSchema)
Reassigns doc in a schema from another schema.
Docs are determined by field id. If a field in the schema cannot be found in the source schema, this will throw IllegalArgumentException.
This will not alter a schema's structure, nullability, or types.
schema - the schema to have doc reassigned
docSourceSchema - the schema from which field doc will be used
java.lang.IllegalArgumentException - if a field cannot be found (by id) in the source schema
public static Schema reassignIds(Schema schema, Schema idSourceSchema, boolean caseSensitive)
Reassigns ids in a schema from another schema.
Ids are determined by field names. If a field in the schema cannot be found in the source schema, this will throw IllegalArgumentException.
This will not alter a schema's structure, nullability, or types.
schema - the schema to have ids reassigned
idSourceSchema - the schema from which field ids will be used
java.lang.IllegalArgumentException - if a field cannot be found (by name) in the source schema
public static Schema reassignOrRefreshIds(Schema schema, Schema idSourceSchema)
public static Schema reassignOrRefreshIds(Schema schema, Schema idSourceSchema, boolean caseSensitive)
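The name-based matching that reassignIds describes above (look up each field by name in the source schema, fail with IllegalArgumentException when there is no match) can be sketched with plain maps. The flat name-to-id map, the helper name, and the lower-casing used for the case-insensitive path are assumptions made for this example; the source does not document how caseSensitive is applied.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class ReassignIdsSketch {
    /**
     * Looks up each field name in the source name-to-id index and returns the matched ids
     * in field order. Throws IllegalArgumentException when a name has no match.
     */
    public static Map<String, Integer> reassignIds(
            List<String> fieldNames, Map<String, Integer> sourceIdsByName, boolean caseSensitive) {
        // Build the lookup index; keys are lower-cased when matching is case-insensitive (assumed)
        Map<String, Integer> lookup = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : sourceIdsByName.entrySet()) {
            String key = caseSensitive ? e.getKey() : e.getKey().toLowerCase(Locale.ROOT);
            lookup.put(key, e.getValue());
        }
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String name : fieldNames) {
            String key = caseSensitive ? name : name.toLowerCase(Locale.ROOT);
            Integer id = lookup.get(key);
            if (id == null) {
                throw new IllegalArgumentException("Field " + name + " not found in source schema");
            }
            result.put(name, id); // the field keeps its own name; only the id is taken from the source
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> source = Map.of("id", 1, "data", 2);
        System.out.println(reassignIds(List.of("ID", "data"), source, false)); // {ID=1, data=2}
        try {
            reassignIds(List.of("ID"), source, true);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // Field ID not found in source schema
        }
    }
}
```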
public static boolean isPromotionAllowed(Type from, Type.PrimitiveType to)
public static void validateWriteSchema(Schema tableSchema, Schema writeSchema, java.lang.Boolean checkNullability, java.lang.Boolean checkOrdering)
Checks whether the Iceberg table can be written with the user-provided write schema.
tableSchema - the table schema written in the Iceberg metadata
writeSchema - the user-provided write schema
checkNullability - if true, writing optional values to a required field is not allowed
checkOrdering - if true, the input schema is not allowed to have a different ordering than the table schema
public static void validateSchema(java.lang.String context, Schema expectedSchema, Schema providedSchema, boolean checkNullability, boolean checkOrdering)
Validates whether the provided schema is compatible with the expected schema.
context - the schema context (e.g. row ID)
expectedSchema - the expected schema
providedSchema - the provided schema
checkNullability - whether to check field nullability
checkOrdering - whether to check field ordering
public static int estimateSize(Types.NestedField field)
Estimates the number of bytes a value for a given field may occupy in memory.
This method approximates the memory size based on heuristics and the internal Java representation defined by Type.TypeID. The actual size may differ from this estimate. The method handles primitive types, strings, and nested types such as structs, maps, and lists.
field - a field for which to estimate the size
public static <T> T visit(Schema schema, TypeUtil.SchemaVisitor<T> visitor)
public static <T> T visit(Type type, TypeUtil.SchemaVisitor<T> visitor)
public static <T> T visit(Schema schema, TypeUtil.CustomOrderSchemaVisitor<T> visitor)
public static <T> T visit(Type type, TypeUtil.CustomOrderSchemaVisitor<T> visitor)
Used to traverse types with traversals other than post-order.
This passes a Supplier to each visitor method that returns the result of traversing child types. Structs are passed an Iterable that traverses child fields during iteration.
An example use is assigning column IDs, which should be done with a pre-order traversal.
T - the type returned by the visitor
type - a type to traverse with a visitor
visitor - a custom order visitor
public static int decimalRequiredBytes(int precision)
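The Supplier-based traversal described for visit(Type, TypeUtil.CustomOrderSchemaVisitor) above can be sketched on a toy tree. The `Node` and `Visitor` types below are hypothetical simplifications (the real visitor has separate methods per type and passes structs an Iterable); the point illustrated is only that children are traversed when the visitor calls `get()`, which lets the visitor number a parent before its children.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class CustomOrderVisitSketch {
    /** Hypothetical simplified type node: a named node with zero or more children. */
    public static final class Node {
        final String name;
        final List<Node> children;

        public Node(String name, List<Node> children) {
            this.name = name;
            this.children = children;
        }
    }

    /** Hypothetical visitor: child results are computed lazily, only when get() is called. */
    public interface Visitor<T> {
        T node(Node node, List<Supplier<T>> childResults);
    }

    /** Wraps each child traversal in a Supplier and lets the visitor decide when to run it. */
    public static <T> T visit(Node node, Visitor<T> visitor) {
        List<Supplier<T>> suppliers = new ArrayList<>();
        for (Node child : node.children) {
            suppliers.add(() -> visit(child, visitor));
        }
        return visitor.node(node, suppliers);
    }

    /** Numbers nodes in pre-order: the parent takes its ID before any child is traversed. */
    public static List<String> assignPreOrder(Node root) {
        List<String> assigned = new ArrayList<>();
        int[] nextId = {1};
        Visitor<Void> preOrder = (node, children) -> {
            int id = nextId[0]++;
            assigned.add(node.name + "=" + id);
            for (Supplier<Void> child : children) {
                child.get(); // descend only after the parent has its ID
            }
            return null;
        };
        visit(root, preOrder);
        return assigned;
    }

    public static void main(String[] args) {
        Node a = new Node("a", List.of(new Node("a1", List.of())));
        Node b = new Node("b", List.of());
        Node root = new Node("row", List.of(a, b));
        System.out.println(assignPreOrder(root)); // [row=1, a=2, a1=3, b=4]
    }
}
```

A post-order visitor would call every `child.get()` before recording the parent, so the same `visit` helper supports either order without changes.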