Class PartitionStatsHandler

java.lang.Object
org.apache.iceberg.PartitionStatsHandler

public class PartitionStatsHandler extends Object
Computes, writes, and reads the PartitionStatisticsFile. Uses generic readers and writers so that the stats can be written and read in the table's default file format.
  • Field Details

  • Method Details

    • schema

      @Deprecated public static Schema schema(Types.StructType unifiedPartitionType)
      Deprecated.
      since 1.10.0, will be removed in 1.11.0. Use schema(StructType, int) instead.
      Generates the partition stats file schema based on a combined partition type which considers all specs in a table.

      Use this only for format versions 1 and 2. For version 3 and above, use schema(StructType, int).

      Parameters:
      unifiedPartitionType - unified partition schema type. Could be calculated by Partitioning.partitionType(Table).
      Returns:
      a schema that corresponds to the provided unified partition type.
    • schema

      public static Schema schema(Types.StructType unifiedPartitionType, int formatVersion)
      Generates the partition stats file schema for a given format version based on a combined partition type which considers all specs in a table.
      Parameters:
      unifiedPartitionType - unified partition schema type. Could be calculated by Partitioning.partitionType(Table).
      formatVersion - table format version for which the schema is generated.
      Returns:
      a schema that corresponds to the provided unified partition type.
    • computeAndWriteStatsFile

      public static PartitionStatisticsFile computeAndWriteStatsFile(Table table) throws IOException
      Incrementally computes the stats for the snapshots that follow the most recent snapshot with a partition stats file, up to the table's current snapshot, merges them with the existing stats, and writes the combined result into a PartitionStatisticsFile for the current snapshot.

      Performs a full computation if no previous statistics file exists.

      Parameters:
      table - The Table for which the partition statistics are computed.
      Returns:
      PartitionStatisticsFile for the current snapshot, or null if no statistics are present.
      Throws:
      IOException
    • computeAndWriteStatsFile

      public static PartitionStatisticsFile computeAndWriteStatsFile(Table table, long snapshotId) throws IOException
      Incrementally computes the stats for the snapshots that follow the most recent snapshot with a partition stats file, up to the given snapshot, merges them with the existing stats, and writes the combined result into a PartitionStatisticsFile for the given snapshot.

      Performs a full computation if no previous statistics file exists.

      Parameters:
      table - The Table for which the partition statistics are computed.
      snapshotId - ID of the snapshot for which partition statistics are computed.
      Returns:
      PartitionStatisticsFile for the given snapshot, or null if no statistics are present.
      Throws:
      IOException
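
      The incremental-versus-full behavior described for both computeAndWriteStatsFile overloads can be pictured with a small, self-contained sketch. This is not Iceberg code: IncrementalStatsSketch, Stats, AddedFile, aggregate and compute are hypothetical names, and the model assumes that snapshots only add data files (deletes and removed files are ignored). It only illustrates that merging a delta into previously stored stats should give the same result as a full computation.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the documented behavior. None of these
// names are Iceberg classes, and only file additions are modeled.
class IncrementalStatsSketch {

  // Toy per-partition counters (immutable).
  static final class Stats {
    final long recordCount;
    final int fileCount;

    Stats(long recordCount, int fileCount) {
      this.recordCount = recordCount;
      this.fileCount = fileCount;
    }

    // Returns a new Stats holding the sum of both operands.
    Stats plus(Stats other) {
      return new Stats(recordCount + other.recordCount, fileCount + other.fileCount);
    }
  }

  // Toy description of one data file added by some snapshot.
  static final class AddedFile {
    final String partition;
    final long recordCount;

    AddedFile(String partition, long recordCount) {
      this.partition = partition;
      this.recordCount = recordCount;
    }
  }

  // Aggregates added files into per-partition stats.
  static Map<String, Stats> aggregate(List<AddedFile> files) {
    Map<String, Stats> result = new HashMap<>();
    for (AddedFile file : files) {
      result.merge(file.partition, new Stats(file.recordCount, 1), Stats::plus);
    }
    return result;
  }

  // previous == null models "no previous statistics file": full compute.
  // Otherwise only the files added since the previous stats are aggregated
  // and merged into a copy of the previous result.
  static Map<String, Stats> compute(
      Map<String, Stats> previous,
      List<AddedFile> allFiles,
      List<AddedFile> filesSincePrevious) {
    if (previous == null) {
      return aggregate(allFiles);
    }
    Map<String, Stats> merged = new HashMap<>(previous);
    aggregate(filesSincePrevious)
        .forEach((partition, delta) -> merged.merge(partition, delta, Stats::plus));
    return merged;
  }
}
```

      In this toy model the incremental path touches only the files added after the previous stats were written, which is the motivation for reusing an existing statistics file when one is available.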
    • readPartitionStatsFile

      public static CloseableIterable<PartitionStats> readPartitionStatsFile(Schema schema, InputFile inputFile)
      Reads partition statistics from the specified InputFile using the given schema.
      Parameters:
      schema - The Schema of the partition statistics file.
      inputFile - An InputFile pointing to the partition stats file.