TableMigrationUtil

java.lang.Object
- org.apache.iceberg.data.TableMigrationUtil

public class TableMigrationUtil
extends java.lang.Object

Method Summary

Modifier and Type	Method and Description
`static java.util.List<DataFile>`	`listPartition(java.util.Map<java.lang.String,java.lang.String> partition, java.lang.String uri, java.lang.String format, PartitionSpec spec, org.apache.hadoop.conf.Configuration conf, MetricsConfig metricsConfig, NameMapping mapping)` Returns the data files in a partition by listing the partition location.
`static java.util.List<DataFile>`	`listPartition(java.util.Map<java.lang.String,java.lang.String> partition, java.lang.String partitionUri, java.lang.String format, PartitionSpec spec, org.apache.hadoop.conf.Configuration conf, MetricsConfig metricsSpec, NameMapping mapping, int parallelism)` Returns the data files in a partition by listing the partition location.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Method Detail

listPartition

public static java.util.List<DataFile> listPartition(java.util.Map<java.lang.String,java.lang.String> partition,
                                                     java.lang.String uri,
                                                     java.lang.String format,
                                                     PartitionSpec spec,
                                                     org.apache.hadoop.conf.Configuration conf,
                                                     MetricsConfig metricsConfig,
                                                     NameMapping mapping)

Returns the data files in a partition by listing the partition location.

For Parquet and ORC partitions, this will read metrics from the file footer. For Avro partitions, metrics other than row count are set to null.

Note: certain metrics, like NaN counts, that are only supported by Iceberg file writers but not file footers, will not be populated.

Parameters:
partition - map of column names to column values for the partition
uri - partition location URI
format - partition format: avro, parquet, or orc
spec - a partition spec
conf - a Hadoop conf
metricsConfig - a metrics conf
mapping - a name mapping

Returns:
a List of DataFile
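
The format-dependent behavior described for this method can be summarized in a small sketch. This is not Iceberg code: the class `FormatMetricsSketch`, the enum `MetricsSource`, and the method `metricsSourceFor` are hypothetical names used only for illustration. The lower-casing of the format string and the exception thrown for an unrecognized format are assumptions, not documented behavior of `listPartition`.

```java
import java.util.Locale;

// Hypothetical illustration only; none of these names exist in Iceberg.
class FormatMetricsSketch {

    // Where the metrics for a listed file come from, per the description above.
    enum MetricsSource {
        FILE_FOOTER,     // Parquet and ORC: metrics are read from the file footer
        ROW_COUNT_ONLY   // Avro: metrics other than row count are null
    }

    static MetricsSource metricsSourceFor(String format) {
        // Assumption: the format name is matched case-insensitively.
        switch (format.toLowerCase(Locale.ROOT)) {
            case "parquet":
            case "orc":
                return MetricsSource.FILE_FOOTER;
            case "avro":
                return MetricsSource.ROW_COUNT_ONLY;
            default:
                // Assumption: anything other than the three documented formats is rejected.
                throw new IllegalArgumentException("Unsupported format: " + format);
        }
    }

    public static void main(String[] args) {
        for (String format : new String[] {"avro", "parquet", "orc"}) {
            System.out.println(format + " -> " + metricsSourceFor(format));
        }
    }
}
```

In either case, metrics that only Iceberg file writers produce, such as NaN counts, are not populated.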

listPartition

public static java.util.List<DataFile> listPartition(java.util.Map<java.lang.String,java.lang.String> partition,
                                                     java.lang.String partitionUri,
                                                     java.lang.String format,
                                                     PartitionSpec spec,
                                                     org.apache.hadoop.conf.Configuration conf,
                                                     MetricsConfig metricsSpec,
                                                     NameMapping mapping,
                                                     int parallelism)

Returns the data files in a partition by listing the partition location. Metrics are read from the files, and the files are read in parallel using the specified number of threads.

For Parquet and ORC partitions, this will read metrics from the file footer. For Avro partitions, metrics other than row count are set to null.

Note: certain metrics, like NaN counts, that are only supported by Iceberg file writers but not file footers, will not be populated.

Parameters:
partition - map of column names to column values for the partition
partitionUri - partition location URI
format - partition format: avro, parquet, or orc
spec - a partition spec
conf - a Hadoop conf
metricsSpec - a metrics conf
mapping - a name mapping
parallelism - number of threads to use for file reading

Returns:
a List of DataFile
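
The general pattern behind the parallelism parameter (list a location, then read each file on a bounded pool of threads) can be sketched with the JDK alone. This is a minimal sketch, not the Iceberg implementation: `ParallelListingSketch`, `FileInfo`, and `listDirectory` are hypothetical names, a local directory stands in for the partition location, and the file size stands in for the metrics that would be read from each file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; this is not how Iceberg is implemented.
class ParallelListingSketch {

    // Stand-in for a DataFile: a path plus one cheap per-file "metric".
    static final class FileInfo {
        final Path path;
        final long sizeInBytes;

        FileInfo(Path path, long sizeInBytes) {
            this.path = path;
            this.sizeInBytes = sizeInBytes;
        }
    }

    static List<FileInfo> listDirectory(Path dir, int parallelism)
            throws IOException, InterruptedException {
        // Step 1: list the location once, keeping regular files in a stable order.
        List<Path> files;
        try (Stream<Path> entries = Files.list(dir)) {
            files = entries.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }

        // Step 2: read each file on a fixed pool of `parallelism` threads.
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<Future<FileInfo>> futures = new ArrayList<>();
            for (Path file : files) {
                futures.add(pool.submit(() -> new FileInfo(file, Files.size(file))));
            }

            // Step 3: collect results in submission order.
            List<FileInfo> result = new ArrayList<>();
            for (Future<FileInfo> future : futures) {
                try {
                    result.add(future.get());
                } catch (ExecutionException e) {
                    throw new IOException("Failed to read a file in " + dir, e.getCause());
                }
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling `ParallelListingSketch.listDirectory(dir, 4)` would read the files under `dir` on four threads. A larger thread count mainly helps when per-file reads are dominated by I/O latency, as is typical for remote storage.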
