All Superinterfaces: PendingUpdate<Schema>

public interface UpdateSchema extends PendingUpdate<Schema>

API for schema evolution.

When committing, these changes will be applied to the current table metadata. Commit conflicts will not be resolved and will result in a CommitFailedException.

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| default UpdateSchema | addColumn(String parent, String name, Type type) | Add a new column to a nested struct. |
| UpdateSchema | addColumn(String parent, String name, Type type, String doc) | Add a new column to a nested struct. |
| default UpdateSchema | addColumn(String name, Type type) | Add a new top-level column. |
| UpdateSchema | addColumn(String name, Type type, String doc) | Add a new top-level column. |
| default UpdateSchema | addRequiredColumn(String parent, String name, Type type) | Add a new required column to a nested struct. |
| UpdateSchema | addRequiredColumn(String parent, String name, Type type, String doc) | Add a new required column to a nested struct. |
| default UpdateSchema | addRequiredColumn(String name, Type type) | Add a new required top-level column. |
| UpdateSchema | addRequiredColumn(String name, Type type, String doc) | Add a new required top-level column. |
| UpdateSchema | allowIncompatibleChanges() | Allow incompatible changes to the schema. |
| default UpdateSchema | caseSensitive(boolean caseSensitive) | Determines whether case is considered when comparing column names. |
| UpdateSchema | deleteColumn(String name) | Delete a column in the schema. |
| UpdateSchema | makeColumnOptional(String name) | Update a column to optional. |
| UpdateSchema | moveAfter(String name, String afterName) | Move a column from its current position to directly after a reference column. |
| UpdateSchema | moveBefore(String name, String beforeName) | Move a column from its current position to directly before a reference column. |
| UpdateSchema | moveFirst(String name) | Move a column from its current position to the start of the schema or its parent struct. |
| UpdateSchema | renameColumn(String name, String newName) | Rename a column in the schema. |
| UpdateSchema | requireColumn(String name) | Update a column to required. |
| default UpdateSchema | setIdentifierFields(String... names) | Set the identifier fields given some field names. |
| UpdateSchema | setIdentifierFields(Collection<String> names) | Set the identifier fields given a set of field names. |
| UpdateSchema | unionByNameWith(Schema newSchema) | Applies all field additions and updates from the provided new schema to the existing schema to create a union schema. |
| UpdateSchema | updateColumn(String name, Type.PrimitiveType newType) | Update a column in the schema to a new primitive type. |
| default UpdateSchema | updateColumn(String name, Type.PrimitiveType newType, String newDoc) | Update a column in the schema to a new primitive type. |
| UpdateSchema | updateColumnDoc(String name, String newDoc) | Update the documentation string for a column. |

Methods inherited from interface org.apache.iceberg.PendingUpdate
apply, commit, updateEvent

Method Details
- allowIncompatibleChanges
  
  UpdateSchema allowIncompatibleChanges()
  
  Allow incompatible changes to the schema.
  Incompatible changes can cause failures when attempting to read older data files. For example, adding a required column and attempting to read data files without that column will cause a failure. However, if every data file in the table is compatible with the change, it can be allowed.
  This option allows incompatible changes to be made to a schema. It should be used only when the caller has validated that the change will not break reads of existing data. For example, if a column was added as optional but is always populated, and data older than the column addition has been deleted from the table, this can be used with requireColumn(String) to mark the column required.
  
  Returns:
  
  this for method chaining
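
Because the incompatible methods are documented to throw unless allowIncompatibleChanges() has already been called, the order of calls in a chain matters. The sketch below is a toy model of that gating only, not Iceberg code: the class GatedUpdate, its fields, and its error messages are hypothetical, and it tracks nothing except whether each column is required.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy stand-in for a pending schema update; NOT the Iceberg implementation.
class GatedUpdate {
    // column name -> true if required, false if optional
    private final Map<String, Boolean> required = new LinkedHashMap<>();
    private boolean allowIncompatible = false;

    // Adds a new optional column (always a compatible change).
    GatedUpdate addColumn(String name) {
        required.put(name, false);
        return this;
    }

    GatedUpdate allowIncompatibleChanges() {
        allowIncompatible = true;
        return this;
    }

    GatedUpdate requireColumn(String name) {
        if (!required.containsKey(name)) {
            throw new IllegalArgumentException("Unknown column: " + name);
        }
        // optional -> required is rejected unless the caller opted in first
        if (!required.get(name) && !allowIncompatible) {
            throw new IllegalArgumentException("Incompatible change: " + name);
        }
        required.put(name, true);
        return this;
    }

    boolean isRequired(String name) {
        return Boolean.TRUE.equals(required.get(name));
    }

    public static void main(String[] args) {
        GatedUpdate update = new GatedUpdate().addColumn("event_ts");
        try {
            update.requireColumn("event_ts"); // no opt-in yet, so this throws
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        update.allowIncompatibleChanges().requireColumn("event_ts");
        System.out.println("event_ts required: " + update.isRequired("event_ts"));
    }
}
```

In this model the opt-in has to precede the incompatible call in the chain; calling it afterwards would be too late.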
- addColumn
  
  default UpdateSchema addColumn(String name, Type type)
  
  Add a new top-level column.
  Because "." may be interpreted as a column path separator or may be used in field names, it is not allowed in names passed to this method. To add to nested structures or to add fields with names that contain ".", use addColumn(String, String, Type).
  If type is a nested type, its field IDs are reassigned when added to the existing schema.
  
  Parameters:
  
  name - name for the new column
  
  type - type for the new column
  
  Returns:
  
  this for method chaining
  
  Throws:
  
  IllegalArgumentException - If name contains "."
- addColumn
  
  UpdateSchema addColumn(String name, Type type, String doc)
  
  Add a new top-level column.
  Because "." may be interpreted as a column path separator or may be used in field names, it is not allowed in names passed to this method. To add to nested structures or to add fields with names that contain ".", use addColumn(String, String, Type).
  If type is a nested type, its field IDs are reassigned when added to the existing schema.
  
  Parameters:
  
  name - name for the new column
  
  type - type for the new column
  
  doc - documentation string for the new column
  
  Returns:
  
  this for method chaining
  
  Throws:
  
  IllegalArgumentException - If name contains "."
- addColumn
  
  default UpdateSchema addColumn(String parent, String name, Type type)
  
  Add a new column to a nested struct.
  The parent name is used to find the parent using Schema.findField(String). If the parent name is null, the new column will be added to the root as a top-level column. If parent identifies a struct, a new column is added to that struct. If it identifies a list, the column is added to the list element struct, and if it identifies a map, the new column is added to the map's value struct.
  The given name is used to name the new column and names containing "." are not handled differently.
  If type is a nested type, its field IDs are reassigned when added to the existing schema.
  
  Parameters:
  
  parent - name of the parent struct that the column will be added to
  
  name - name for the new column
  
  type - type for the new column
  
  Returns:
  
  this for method chaining
  
  Throws:
  
  IllegalArgumentException - If parent doesn't identify a struct
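
The reason "." is rejected by the two-argument form but accepted by the parent-qualified form can be seen in a small model. This is a toy illustration, not Iceberg code: DottedNameSketch and Field are hypothetical names, and the sketch does not check that the parent exists or is a struct.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of how a new column's location is recorded; NOT Iceberg code.
class DottedNameSketch {
    static final class Field {
        final String parent; // null means top level
        final String name;

        Field(String parent, String name) {
            this.parent = parent;
            this.name = name;
        }

        String fullName() {
            return parent == null ? name : parent + "." + name;
        }
    }

    final List<Field> fields = new ArrayList<>();

    // Top-level form: "." is rejected because "a.b" could mean either layout.
    DottedNameSketch addColumn(String name) {
        if (name.contains(".")) {
            throw new IllegalArgumentException("Ambiguous name: " + name);
        }
        return addColumn(null, name);
    }

    // Parent-qualified form: the name is taken literally, dots included.
    DottedNameSketch addColumn(String parent, String name) {
        fields.add(new Field(parent, name));
        return this;
    }

    public static void main(String[] args) {
        DottedNameSketch s = new DottedNameSketch()
            .addColumn("a", "b")      // field "b" nested inside "a"
            .addColumn(null, "a.b");  // top-level field literally named "a.b"
        for (Field f : s.fields) {
            // Both fields print the same full name but different parents.
            System.out.println(f.fullName() + " parent=" + f.parent);
        }
    }
}
```

Both fields share the full name a.b, which is why a single dotted string cannot say which layout the caller intended.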
- addColumn
  
  UpdateSchema addColumn(String parent, String name, Type type, String doc)
  
  Add a new column to a nested struct.
  The parent name is used to find the parent using Schema.findField(String). If the parent name is null, the new column will be added to the root as a top-level column. If parent identifies a struct, a new column is added to that struct. If it identifies a list, the column is added to the list element struct, and if it identifies a map, the new column is added to the map's value struct.
  The given name is used to name the new column and names containing "." are not handled differently.
  If type is a nested type, its field IDs are reassigned when added to the existing schema.
  
  Parameters:
  
  parent - name of the parent struct that the column will be added to
  
  name - name for the new column
  
  type - type for the new column
  
  doc - documentation string for the new column
  
  Returns:
  
  this for method chaining
  
  Throws:
  
  IllegalArgumentException - If parent doesn't identify a struct
- addRequiredColumn
  
  default UpdateSchema addRequiredColumn(String name, Type type)
  
  Add a new required top-level column.
  This is an incompatible change that can break reading older data. This method will result in an exception unless allowIncompatibleChanges() has been called.
  Because "." may be interpreted as a column path separator or may be used in field names, it is not allowed in names passed to this method. To add to nested structures or to add fields with names that contain ".", use addRequiredColumn(String, String, Type).
  If type is a nested type, its field IDs are reassigned when added to the existing schema.
  
  Parameters:
  
  name - name for the new column
  
  type - type for the new column
  
  Returns:
  
  this for method chaining
  
  Throws:
  
  IllegalArgumentException - If name contains "."
- addRequiredColumn
  
  UpdateSchema addRequiredColumn(String name, Type type, String doc)
  
  Add a new required top-level column.
  This is an incompatible change that can break reading older data. This method will result in an exception unless allowIncompatibleChanges() has been called.
  Because "." may be interpreted as a column path separator or may be used in field names, it is not allowed in names passed to this method. To add to nested structures or to add fields with names that contain ".", use addRequiredColumn(String, String, Type).
  If type is a nested type, its field IDs are reassigned when added to the existing schema.
  
  Parameters:
  
  name - name for the new column
  
  type - type for the new column
  
  doc - documentation string for the new column
  
  Returns:
  
  this for method chaining
  
  Throws:
  
  IllegalArgumentException - If name contains "."
- addRequiredColumn
  
  default UpdateSchema addRequiredColumn(String parent, String name, Type type)
  
  Add a new required column to a nested struct.
  This is an incompatible change that can break reading older data. This method will result in an exception unless allowIncompatibleChanges() has been called.
  The parent name is used to find the parent using Schema.findField(String). If the parent name is null, the new column will be added to the root as a top-level column. If parent identifies a struct, a new column is added to that struct. If it identifies a list, the column is added to the list element struct, and if it identifies a map, the new column is added to the map's value struct.
  The given name is used to name the new column and names containing "." are not handled differently.
  If type is a nested type, its field IDs are reassigned when added to the existing schema.
  
  Parameters:
  
  parent - name of the parent struct that the column will be added to
  
  name - name for the new column
  
  type - type for the new column
  
  Returns:
  
  this for method chaining
  
  Throws:
  
  IllegalArgumentException - If parent doesn't identify a struct
- addRequiredColumn
  
  UpdateSchema addRequiredColumn(String parent, String name, Type type, String doc)
  
  Add a new required column to a nested struct.
  This is an incompatible change that can break reading older data. This method will result in an exception unless allowIncompatibleChanges() has been called.
  The parent name is used to find the parent using Schema.findField(String). If the parent name is null, the new column will be added to the root as a top-level column. If parent identifies a struct, a new column is added to that struct. If it identifies a list, the column is added to the list element struct, and if it identifies a map, the new column is added to the map's value struct.
  The given name is used to name the new column and names containing "." are not handled differently.
  If type is a nested type, its field IDs are reassigned when added to the existing schema.
  
  Parameters:
  
  parent - name of the parent struct that the column will be added to
  
  name - name for the new column
  
  type - type for the new column
  
  doc - documentation string for the new column
  
  Returns:
  
  this for method chaining
  
  Throws:
  
  IllegalArgumentException - If parent doesn't identify a struct
- renameColumn
  
  UpdateSchema renameColumn(String name, String newName)
  
  Rename a column in the schema.
  The name is used to find the column to rename using Schema.findField(String).
  The new name may contain "." and such names are not parsed or handled differently.
  Columns may be updated and renamed in the same schema update.
  
  Parameters:
  
  name - name of the column to rename
  
  newName - replacement name for the column
  
  Returns:
  
  this for method chaining
  
  Throws:
  
  IllegalArgumentException - If name doesn't identify a column in the schema or if this change conflicts with other additions, renames, or updates.
- updateColumn
  
  UpdateSchema updateColumn(String name, Type.PrimitiveType newType)
  
  Update a column in the schema to a new primitive type.
  The name is used to find the column to update using Schema.findField(String).
  Only updates that widen types are allowed.
  Columns may be updated and renamed in the same schema update.
  
  Parameters:
  
  name - name of the column to update
  
  newType - replacement type for the column
  
  Returns:
  
  this for method chaining
  
  Throws:
  
  IllegalArgumentException - If name doesn't identify a column in the schema or if this change introduces a type incompatibility or if it conflicts with other additions, renames, or updates.
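
The rule that only widening updates are allowed can be pictured as a small predicate. The sketch below assumes the promotions listed in the Iceberg table specification (int to long, float to double, and decimal to a larger precision with the same scale); newer spec versions may permit others. It is a toy model, not the library's code: WideningSketch and the string encoding of types are hypothetical.

```java
// Toy model of primitive type widening; NOT the Iceberg implementation.
class WideningSketch {
    // Types are written as strings such as "int", "long", "decimal(9,2)".
    static boolean canWiden(String from, String to) {
        if (from.equals(to)) {
            return true; // no change is trivially allowed
        }
        if (from.equals("int")) {
            return to.equals("long");
        }
        if (from.equals("float")) {
            return to.equals("double");
        }
        if (from.startsWith("decimal(") && to.startsWith("decimal(")) {
            int[] f = parseDecimal(from);
            int[] t = parseDecimal(to);
            // same scale, precision may only grow
            return f[1] == t[1] && f[0] <= t[0];
        }
        return false;
    }

    // "decimal(P,S)" -> {P, S}
    private static int[] parseDecimal(String type) {
        String inner = type.substring("decimal(".length(), type.length() - 1);
        String[] parts = inner.split(",");
        return new int[] {
            Integer.parseInt(parts[0].trim()),
            Integer.parseInt(parts[1].trim())
        };
    }

    public static void main(String[] args) {
        System.out.println(canWiden("int", "long"));                   // true
        System.out.println(canWiden("long", "int"));                   // false
        System.out.println(canWiden("decimal(9,2)", "decimal(18,2)")); // true
        System.out.println(canWiden("decimal(9,2)", "decimal(18,4)")); // false
    }
}
```

A narrowing such as long to int is the kind of change the Throws clause above describes as a type incompatibility.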
- updateColumn
  
  default UpdateSchema updateColumn(String name, Type.PrimitiveType newType, String newDoc)
  
  Update a column in the schema to a new primitive type.
  The name is used to find the column to update using Schema.findField(String).
  Only updates that widen types are allowed.
  Columns may be updated and renamed in the same schema update.
  
  Parameters:
  
  name - name of the column to update
  
  newType - replacement type for the column
  
  newDoc - replacement documentation string for the column
  
  Returns:
  
  this for method chaining
  
  Throws:
  
  IllegalArgumentException - If name doesn't identify a column in the schema or if this change introduces a type incompatibility or if it conflicts with other additions, renames, or updates.
- updateColumnDoc
  
  UpdateSchema updateColumnDoc(String name, String newDoc)
  
  Update the documentation string for a column.
  The name is used to find the column to update using Schema.findField(String).
  
  Parameters:
  
  name - name of the column to update the documentation string for
  
  newDoc - replacement documentation string for the column
  
  Returns:
  
  this for method chaining
  
  Throws:
  
  IllegalArgumentException - If name doesn't identify a column in the schema or if the column will be deleted
- makeColumnOptional
  
  UpdateSchema makeColumnOptional(String name)
  
  Update a column to optional.
  
  Parameters:
  
  name - name of the column to mark optional
  
  Returns:
  
  this for method chaining
- requireColumn
  
  UpdateSchema requireColumn(String name)
  
  Update a column to required.
  This is an incompatible change that can break reading older data. This method will result in an exception unless allowIncompatibleChanges() has been called.
  
  Parameters:
  
  name - name of the column to mark required
  
  Returns:
  
  this for method chaining
- deleteColumn
  
  UpdateSchema deleteColumn(String name)
  
  Delete a column in the schema.
  The name is used to find the column to delete using Schema.findField(String).
  
  Parameters:
  
  name - name of the column to delete
  
  Returns:
  
  this for method chaining
  
  Throws:
  
  IllegalArgumentException - If name doesn't identify a column in the schema or if this change conflicts with other additions, renames, or updates.
- moveFirst
  
  UpdateSchema moveFirst(String name)
  
  Move a column from its current position to the start of the schema or its parent struct.
  
  Parameters:
  
  name - name of the column to move
  
  Returns:
  
  this for method chaining
  
  Throws:
  
  IllegalArgumentException - If name doesn't identify a column in the schema or if this change conflicts with other changes.
- moveBefore
  
  UpdateSchema moveBefore(String name, String beforeName)
  
  Move a column from its current position to directly before a reference column.
  The name is used to find the column to move using Schema.findField(String). If the name identifies a nested column, it can only be moved within the nested struct that contains it.
  
  Parameters:
  
  name - name of the column to move
  
  beforeName - name of the reference column
  
  Returns:
  
  this for method chaining
  
  Throws:
  
  IllegalArgumentException - If name doesn't identify a column in the schema or if this change conflicts with other changes.
- moveAfter
  
  UpdateSchema moveAfter(String name, String afterName)
  
  Move a column from its current position to directly after a reference column.
  The name is used to find the column to move using Schema.findField(String). If the name identifies a nested column, it can only be moved within the nested struct that contains it.
  
  Parameters:
  
  name - name of the column to move
  
  afterName - name of the reference column
  
  Returns:
  
  this for method chaining
  
  Throws:
  
  IllegalArgumentException - If name doesn't identify a column in the schema or if this change conflicts with other changes.
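
The three move operations differ only in where the column is reinserted. The sketch below models them on a plain list of sibling column names. It is a toy illustration, not Iceberg code: MoveSketch and its helpers are hypothetical, and it ignores nesting, since a nested column can only move within its own struct anyway.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of column reordering among siblings; NOT the Iceberg implementation.
class MoveSketch {
    static List<String> moveFirst(List<String> cols, String name) {
        List<String> out = without(cols, name);
        out.add(0, name);
        return out;
    }

    static List<String> moveBefore(List<String> cols, String name, String beforeName) {
        List<String> out = without(cols, name);
        out.add(indexOf(out, beforeName), name);
        return out;
    }

    static List<String> moveAfter(List<String> cols, String name, String afterName) {
        List<String> out = without(cols, name);
        out.add(indexOf(out, afterName) + 1, name);
        return out;
    }

    // Copy of cols with the moved column taken out of its current position.
    private static List<String> without(List<String> cols, String name) {
        if (!cols.contains(name)) {
            throw new IllegalArgumentException("Unknown column: " + name);
        }
        List<String> out = new ArrayList<>(cols);
        out.remove(name);
        return out;
    }

    private static int indexOf(List<String> cols, String ref) {
        int i = cols.indexOf(ref);
        if (i < 0) {
            throw new IllegalArgumentException("Unknown reference column: " + ref);
        }
        return i;
    }

    public static void main(String[] args) {
        List<String> cols = Arrays.asList("id", "data", "ts");
        System.out.println(moveFirst(cols, "ts"));          // [ts, id, data]
        System.out.println(moveBefore(cols, "ts", "data")); // [id, ts, data]
        System.out.println(moveAfter(cols, "id", "data"));  // [data, id, ts]
    }
}
```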
- unionByNameWith
  
  UpdateSchema unionByNameWith(Schema newSchema)
  
  Applies all field additions and updates from the provided new schema to the existing schema to create a union schema.
  For fields with the same canonical name in both schemas, any type change must be a widening supported by updateColumn(String, Type.PrimitiveType).
  The only nullability change supported is turning a previously required field into an optional one, using makeColumnOptional(String), when the field is marked optional in the provided new schema.
  Existing field docs are updated with the field docs from the provided new schema using updateColumnDoc(String, String).
  
  Parameters:
  
  newSchema - a schema used in conjunction with the existing schema to create a union schema
  
  Returns:
  
  this for method chaining
  
  Throws:
  
  IllegalStateException - If it encounters errors during provided schema traversal
  
  IllegalArgumentException - If name doesn't identify a column in the schema or if this change introduces a type incompatibility or if it conflicts with other additions, renames, or updates.
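
A union by name can be pictured as a merge of two name-to-type maps. The sketch below is a toy model under several assumptions, not Iceberg code: UnionSketch is a hypothetical name, types are plain strings, only int to long and float to double count as widening, new fields are appended after existing ones, and any other type mismatch is rejected (the library may treat some mismatches differently).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of a union-by-name merge; NOT the Iceberg implementation.
class UnionSketch {
    static Map<String, String> unionByName(Map<String, String> existing,
                                           Map<String, String> incoming) {
        // Start from the existing schema: nothing is ever deleted by a union.
        Map<String, String> result = new LinkedHashMap<>(existing);
        for (Map.Entry<String, String> e : incoming.entrySet()) {
            String name = e.getKey();
            String newType = e.getValue();
            String current = result.get(name);
            if (current == null) {
                result.put(name, newType); // field addition
            } else if (!current.equals(newType)) {
                if (!widens(current, newType)) {
                    throw new IllegalArgumentException(
                        "Cannot change " + name + " from " + current + " to " + newType);
                }
                result.put(name, newType); // widening update, position unchanged
            }
        }
        return result;
    }

    // Assumed subset of allowed widenings, for illustration only.
    static boolean widens(String from, String to) {
        return (from.equals("int") && to.equals("long"))
            || (from.equals("float") && to.equals("double"));
    }

    public static void main(String[] args) {
        Map<String, String> existing = new LinkedHashMap<>();
        existing.put("id", "int");
        existing.put("name", "string");

        Map<String, String> incoming = new LinkedHashMap<>();
        incoming.put("id", "long");
        incoming.put("email", "string");

        // prints {id=long, name=string, email=string}
        System.out.println(unionByName(existing, incoming));
    }
}
```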
- setIdentifierFields
  
  UpdateSchema setIdentifierFields(Collection<String> names)
  
  Set the identifier fields given a set of field names.
  Because identifier fields are unique, duplicated names will be ignored. See Schema.identifierFieldIds() to learn more about Iceberg identifier fields.
  
  Parameters:
  
  names - names of the columns to set as identifier fields
  
  Returns:
  
  this for method chaining
- setIdentifierFields
  
  default UpdateSchema setIdentifierFields(String... names)
  
  Set the identifier fields given some field names. See setIdentifierFields(Collection) for more details.
  
  Parameters:
  
  names - names of the columns to set as identifier fields
  
  Returns:
  
  this for method chaining
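
The statement that duplicated names are ignored amounts to treating the input as a set, with the varargs overload delegating to the collection overload. The sketch below is a toy model of that behavior, not Iceberg code: IdentifierFieldsSketch is a hypothetical name, and the insertion-ordered set is used only to make the printed output predictable.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

// Toy model of identifier-field deduplication; NOT the Iceberg implementation.
class IdentifierFieldsSketch {
    // Collection form: duplicates collapse because the result is a set.
    static Set<String> identifierFields(Collection<String> names) {
        return new LinkedHashSet<>(names);
    }

    // Varargs form: a convenience wrapper around the collection form.
    static Set<String> identifierFields(String... names) {
        return identifierFields(Arrays.asList(names));
    }

    public static void main(String[] args) {
        System.out.println(identifierFields("id", "region", "id")); // [id, region]
    }
}
```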
- caseSensitive
  
  default UpdateSchema caseSensitive(boolean caseSensitive)
  
  Determines whether case is considered when comparing column names.
  
  Parameters:
  
  caseSensitive - when false, case is not considered in column name comparisons.
  
  Returns:
  
  this for method chaining
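
The effect of the flag on name lookup can be sketched with a plain list of column names. This is a toy illustration, not Iceberg code: CaseSketch and find are hypothetical names, and the sketch returns the first match without handling names that differ only by case.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of case-sensitive vs. case-insensitive lookup; NOT Iceberg code.
class CaseSketch {
    // Returns the stored column name that matches, or null if none does.
    static String find(List<String> columns, String name, boolean caseSensitive) {
        for (String column : columns) {
            boolean matches = caseSensitive
                ? column.equals(name)
                : column.equalsIgnoreCase(name);
            if (matches) {
                return column;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("Id", "data");
        System.out.println(find(columns, "id", true));  // null
        System.out.println(find(columns, "id", false)); // Id
    }
}
```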
