Interface RewriteManifests

All Superinterfaces:
Action<RewriteManifests,RewriteManifests.Result>, SnapshotUpdate<RewriteManifests,RewriteManifests.Result>
All Known Implementing Classes:
RewriteManifestsSparkAction

public interface RewriteManifests extends SnapshotUpdate<RewriteManifests,RewriteManifests.Result>
An action that rewrites manifests.
  • Method Details

    • specId

      RewriteManifests specId(int specId)
      Rewrites manifests for the given partition spec ID.

      If not set, defaults to the table's default spec ID.

      Parameters:
      specId - the ID of the partition spec whose manifests should be rewritten
      Returns:
      this for method chaining
    • rewriteIf

      RewriteManifests rewriteIf(Predicate<ManifestFile> predicate)
      Rewrites only manifests that match the given predicate.

      If not set, all manifests will be rewritten.

      Parameters:
      predicate - a predicate
      Returns:
      this for method chaining
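
      As an illustration of the selection rule above, the following is a minimal, self-contained sketch. ManifestStub is a hypothetical stand-in for ManifestFile that models only a path and a length, the 8 MiB threshold is an assumed value chosen for the example, and select is a hypothetical helper; none of these are part of this interface. A null predicate models the "not set" case, in which every manifest is selected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class RewriteIfSketch {
    // Hypothetical stand-in for ManifestFile; only the fields used below are modeled.
    static final class ManifestStub {
        final String path;
        final long lengthBytes;

        ManifestStub(String path, long lengthBytes) {
            this.path = path;
            this.lengthBytes = lengthBytes;
        }
    }

    // Assumed threshold for illustration: treat manifests under 8 MiB as "small".
    static final long SMALL_MANIFEST_BYTES = 8L * 1024 * 1024;

    // The kind of predicate one might pass to rewriteIf: match small manifests only.
    static final Predicate<ManifestStub> IS_SMALL = m -> m.lengthBytes < SMALL_MANIFEST_BYTES;

    // Returns the manifests that would be selected for rewriting.
    // A null predicate models "not set": every manifest is selected.
    static List<ManifestStub> select(List<ManifestStub> manifests, Predicate<ManifestStub> predicate) {
        List<ManifestStub> selected = new ArrayList<>();
        for (ManifestStub m : manifests) {
            if (predicate == null || predicate.test(m)) {
                selected.add(m);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<ManifestStub> manifests = Arrays.asList(
            new ManifestStub("m1.avro", 2L * 1024 * 1024),
            new ManifestStub("m2.avro", 64L * 1024 * 1024),
            new ManifestStub("m3.avro", 512L * 1024));
        // Print the paths of the manifests matched by the small-manifest predicate.
        for (ManifestStub m : select(manifests, IS_SMALL)) {
            System.out.println(m.path);
        }
    }
}
```

      With the real API, the lambda body is the part that carries over: a predicate of the same shape would be written against ManifestFile rather than the stub.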
    • sortBy

      default RewriteManifests sortBy(List<String> partitionFields)
      Rewrites manifests in a given order, based on partition field names.

      Optionally supply a list of partition field names to sort the rewritten manifests by. Sorting by frequently queried partition fields can reduce planning time, because planning can skip manifests that contain no matching partition values.

      For example, given a table PARTITIONED BY (a, b, c, d), one may wish to rewrite and sort manifests by ('d', 'b') only, based on known query patterns. Rewriting manifests in this way will yield a manifest_list whose manifest_files point to data files that share common 'd', then 'b', partition values.

      If not set, manifests will be rewritten in the order of the transforms in the table's partition spec.

      Parameters:
      partitionFields - the exact transformed column names used for partitioning, not the raw column names that the partitions are derived from. For example, supply 'data_bucket' and not 'data' for a bucket(N, data) partition definition
      Returns:
      this for method chaining
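
      The clustering effect of sorting by ('d', 'b') can be sketched with plain collections. This is a simplified model, not the library's implementation: each partition tuple is represented as a Map from field name to an integer value, and byFields, sorted, and tuple are hypothetical helpers introduced only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SortBySketch {
    // Builds a comparator that orders partition tuples by the named fields,
    // in the order the names are given (first name is the primary sort key).
    static Comparator<Map<String, Integer>> byFields(List<String> partitionFields) {
        Comparator<Map<String, Integer>> cmp = (x, y) -> 0;
        for (String field : partitionFields) {
            cmp = cmp.thenComparing(
                Comparator.comparing((Map<String, Integer> p) -> p.get(field)));
        }
        return cmp;
    }

    // Returns a sorted copy of the tuples; the input list is left unchanged.
    static List<Map<String, Integer>> sorted(List<Map<String, Integer>> tuples,
                                             List<String> partitionFields) {
        List<Map<String, Integer>> copy = new ArrayList<>(tuples);
        copy.sort(byFields(partitionFields));
        return copy;
    }

    // Convenience constructor for a partition tuple over fields (a, b, c, d).
    static Map<String, Integer> tuple(int a, int b, int c, int d) {
        Map<String, Integer> t = new HashMap<>();
        t.put("a", a);
        t.put("b", b);
        t.put("c", c);
        t.put("d", d);
        return t;
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> tuples = Arrays.asList(
            tuple(1, 2, 0, 9),
            tuple(2, 1, 0, 3),
            tuple(3, 0, 0, 9));
        // Sort by 'd' first, then 'b'; tuples sharing a 'd' value end up adjacent.
        for (Map<String, Integer> t : sorted(tuples, Arrays.asList("d", "b"))) {
            System.out.println("d=" + t.get("d") + " b=" + t.get("b") + " a=" + t.get("a"));
        }
    }
}
```

      Because entries with equal 'd' values become adjacent after sorting, writing them out in that order tends to group them into the same manifests, which is what lets a filter on 'd' skip the others.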
    • stagingLocation

      RewriteManifests stagingLocation(String stagingLocation)
      Passes a location where the staged manifests should be written.

      If not set, defaults to the table's metadata location.

      Parameters:
      stagingLocation - a staging location
      Returns:
      this for method chaining