Class SparkReadConf

java.lang.Object
org.apache.iceberg.spark.SparkReadConf

public class SparkReadConf extends Object
A class for common Iceberg configs for Spark reads.

If a config is set at multiple levels, the following order of precedence is used (top to bottom):

  1. Read options
  2. Session configuration
  3. Table metadata

Read options are the most specific level and take precedence over all other configs. If no read option is provided, this class checks the session configuration for an override. If the session configuration has no applicable value, this class falls back to the table metadata.
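The lookup order described above can be sketched in plain Java. This is only an illustration of the precedence rule under simplifying assumptions, not the actual SparkReadConf implementation: the class name `PrecedenceSketch`, the method `resolve`, and the key `example.key` are all hypothetical, and the three levels are modeled as plain maps that share one key.

```java
import java.util.Map;

// Hypothetical illustration only; not the actual SparkReadConf implementation.
class PrecedenceSketch {

  // Returns the value from the most specific level that defines the key:
  // read options first, then session configuration, then table metadata.
  // Falls back to defaultValue when no level defines the key.
  static String resolve(
      String key,
      Map<String, String> readOptions,
      Map<String, String> sessionConf,
      Map<String, String> tableProperties,
      String defaultValue) {
    if (readOptions.containsKey(key)) {
      return readOptions.get(key);
    }
    if (sessionConf.containsKey(key)) {
      return sessionConf.get(key);
    }
    if (tableProperties.containsKey(key)) {
      return tableProperties.get(key);
    }
    return defaultValue;
  }

  public static void main(String[] args) {
    String key = "example.key"; // hypothetical key, shared by all levels for simplicity
    Map<String, String> table = Map.of(key, "from-table");
    Map<String, String> session = Map.of(key, "from-session");
    Map<String, String> options = Map.of(key, "from-option");

    // Read option present: it wins over both other levels.
    System.out.println(resolve(key, options, session, table, "default"));
    // No read option: the session configuration is used.
    System.out.println(resolve(key, Map.of(), session, table, "default"));
    // Neither read option nor session value: the table metadata is used.
    System.out.println(resolve(key, Map.of(), Map.of(), table, "default"));
  }
}
```

In practice each level may use a different key name for the same setting, so a real lookup would likely take one key per level rather than a single shared key as in this sketch.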

Note: this class is NOT meant to be serialized and sent to executors.

  • Constructor Details

    • SparkReadConf

      public SparkReadConf(org.apache.spark.sql.SparkSession spark, Table table, Map<String,String> readOptions)
    • SparkReadConf

      public SparkReadConf(org.apache.spark.sql.SparkSession spark, Table table, String branch, Map<String,String> readOptions)
  • Method Details

    • caseSensitive

      public boolean caseSensitive()
    • localityEnabled

      public boolean localityEnabled()
    • snapshotId

      public Long snapshotId()
    • asOfTimestamp

      public Long asOfTimestamp()
    • startSnapshotId

      public Long startSnapshotId()
    • endSnapshotId

      public Long endSnapshotId()
    • branch

      public String branch()
    • tag

      public String tag()
    • scanTaskSetId

      public String scanTaskSetId()
    • streamingSkipDeleteSnapshots

      public boolean streamingSkipDeleteSnapshots()
    • streamingSkipOverwriteSnapshots

      public boolean streamingSkipOverwriteSnapshots()
    • parquetVectorizationEnabled

      public boolean parquetVectorizationEnabled()
    • parquetBatchSize

      public int parquetBatchSize()
    • orcVectorizationEnabled

      public boolean orcVectorizationEnabled()
    • orcBatchSize

      public int orcBatchSize()
    • splitSizeOption

      public Long splitSizeOption()
    • splitSize

      public long splitSize()
    • splitLookbackOption

      public Integer splitLookbackOption()
    • splitLookback

      public int splitLookback()
    • splitOpenFileCostOption

      public Long splitOpenFileCostOption()
    • splitOpenFileCost

      public long splitOpenFileCost()
    • streamFromTimestamp

      public long streamFromTimestamp()
    • startTimestamp

      public Long startTimestamp()
    • endTimestamp

      public Long endTimestamp()
    • maxFilesPerMicroBatch

      public int maxFilesPerMicroBatch()
    • maxRecordsPerMicroBatch

      public int maxRecordsPerMicroBatch()
    • preserveDataGrouping

      public boolean preserveDataGrouping()
    • aggregatePushDownEnabled

      public boolean aggregatePushDownEnabled()
    • adaptiveSplitSizeEnabled

      public boolean adaptiveSplitSizeEnabled()
    • parallelism

      public int parallelism()
    • distributedPlanningEnabled

      public boolean distributedPlanningEnabled()
    • dataPlanningMode

      public PlanningMode dataPlanningMode()
    • deletePlanningMode

      public PlanningMode deletePlanningMode()
    • executorCacheLocalityEnabled

      public boolean executorCacheLocalityEnabled()
    • reportColumnStats

      public boolean reportColumnStats()