Class MetricsUtil

java.lang.Object
org.apache.iceberg.MetricsUtil

public class MetricsUtil extends Object
  • Field Details

  • Method Details

    • copyWithoutFieldCounts

      public static Metrics copyWithoutFieldCounts(Metrics metrics, Set<Integer> excludedFieldIds)
      Copies a metrics object without value, NULL and NaN counts for given fields.
      Parameters:
      excludedFieldIds - field IDs for which the counts must be dropped
      Returns:
      a new metrics object without counts for given fields
    • copyWithoutFieldCountsAndBounds

      public static Metrics copyWithoutFieldCountsAndBounds(Metrics metrics, Set<Integer> excludedFieldIds)
      Copies a metrics object without counts and bounds for given fields.
      Parameters:
      excludedFieldIds - field IDs for which the counts and bounds must be dropped
      Returns:
      a new metrics object without counts and lower and upper bounds for given fields
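The contrast between the two copy methods above can be illustrated with plain Java maps. This is a minimal sketch and not Iceberg's implementation: it assumes per-field metrics are held in maps keyed by field ID, and `MetricsCopySketch` and `copyWithoutKeys` are hypothetical names introduced only for illustration. Bounds are modeled as strings purely to keep the example short.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class MetricsCopySketch {
  // Hypothetical helper: returns a new map without the excluded field IDs.
  // The source map is left untouched, mirroring the "copy" semantics described above.
  static <V> Map<Integer, V> copyWithoutKeys(Map<Integer, V> source, Set<Integer> excludedFieldIds) {
    Map<Integer, V> copy = new HashMap<>(source);
    copy.keySet().removeAll(excludedFieldIds);
    return copy;
  }

  public static void main(String[] args) {
    // Stand-ins for per-field metrics, keyed by field ID (assumed layout).
    Map<Integer, Long> valueCounts = new HashMap<>();
    valueCounts.put(1, 100L);
    valueCounts.put(2, 100L);

    Map<Integer, String> lowerBounds = new HashMap<>();
    lowerBounds.put(1, "a");
    lowerBounds.put(2, "b");

    Set<Integer> excluded = Collections.singleton(2);

    // Analogue of copyWithoutFieldCounts: counts are filtered, bounds are kept as-is.
    Map<Integer, Long> countsOnlyFiltered = copyWithoutKeys(valueCounts, excluded);
    System.out.println(countsOnlyFiltered.keySet() + " " + lowerBounds.keySet());

    // Analogue of copyWithoutFieldCountsAndBounds: counts and bounds are both filtered.
    Map<Integer, String> boundsFiltered = copyWithoutKeys(lowerBounds, excluded);
    System.out.println(countsOnlyFiltered.keySet() + " " + boundsFiltered.keySet());
  }
}
```

In this sketch the first variant leaves the bounds for field 2 in place, while the second drops every per-field entry for it.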
    • createNanValueCounts

      public static Map<Integer,Long> createNanValueCounts(Stream<FieldMetrics<?>> fieldMetrics, MetricsConfig metricsConfig, Schema inputSchema)
      Constructs a mapping from column ID to NaN value count, based on the input field metrics and the metrics config.
    • metricsMode

      public static MetricsModes.MetricsMode metricsMode(Schema inputSchema, MetricsConfig metricsConfig, int fieldId)
      Extracts the MetricsMode for the given field ID from the metrics config.
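A rough sketch of how a per-field mode lookup and a NaN count mapping could fit together, written against plain Java types rather than the Iceberg classes. Everything here is hypothetical: the `Mode` enum, `modeFor`, and `nanCounts` are illustrative names, and the rule that fields with mode `NONE` are omitted from the result is an assumption of this sketch, not something stated in the descriptions above.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class NanCountsSketch {
  // Hypothetical stand-in for a metrics mode; not Iceberg's MetricsModes.
  enum Mode { NONE, COUNTS, FULL }

  // Resolves a mode for a field ID: field ID -> column name -> configured mode.
  // Falls back to the default when the ID or the column has no explicit entry.
  static Mode modeFor(
      Map<Integer, String> columnNamesById,
      Map<String, Mode> modesByColumn,
      Mode defaultMode,
      int fieldId) {
    String columnName = columnNamesById.get(fieldId);
    if (columnName == null) {
      return defaultMode;
    }
    return modesByColumn.getOrDefault(columnName, defaultMode);
  }

  // Builds field ID -> NaN count from (fieldId, nanCount) pairs.
  // Assumption of this sketch: fields whose mode is NONE are skipped.
  static Map<Integer, Long> nanCounts(
      Stream<Map.Entry<Integer, Long>> fieldNanCounts,
      Map<Integer, String> columnNamesById,
      Map<String, Mode> modesByColumn,
      Mode defaultMode) {
    return fieldNanCounts
        .filter(e -> modeFor(columnNamesById, modesByColumn, defaultMode, e.getKey()) != Mode.NONE)
        .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
  }

  public static void main(String[] args) {
    Map<Integer, String> names = new HashMap<>();
    names.put(1, "price");
    names.put(2, "debug_score");

    Map<String, Mode> modes = new HashMap<>();
    modes.put("debug_score", Mode.NONE);

    Map<Integer, Long> result = nanCounts(
        Stream.of(new SimpleEntry<>(1, 3L), new SimpleEntry<>(2, 7L)),
        names, modes, Mode.COUNTS);

    System.out.println(result); // prints {1=3}
  }
}
```

Keying the result by field ID rather than column name keeps the mapping stable if a column is renamed, which is the reason the lookup goes through the schema first.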
    • readableMetricsSchema

      public static Schema readableMetricsSchema(Schema dataTableSchema, Schema metadataTableSchema)
      Calculates a dynamic schema for readable_metrics to add to metadata tables. The type will be the struct MetricsUtil.ReadableColMetricsStruct, composed of MetricsUtil.ReadableMetricsStruct for all primitive columns in the data table.
      Parameters:
      dataTableSchema - schema of data table
      metadataTableSchema - schema of existing metadata table (to ensure id uniqueness)
      Returns:
      schema of readable_metrics struct
    • readableMetricsStruct

      public static MetricsUtil.ReadableMetricsStruct readableMetricsStruct(Schema schema, ContentFile<?> file, Types.StructType projectedSchema)
      Returns a readable metrics struct row built from file metadata.
      Parameters:
      schema - schema of original data table
      file - content file with metrics
      projectedSchema - user requested projection
      Returns:
      MetricsUtil.ReadableMetricsStruct