Package org.apache.iceberg
Class SnapshotScan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
java.lang.Object
org.apache.iceberg.SnapshotScan<ThisT,T,G>
- Type Parameters:
ThisT - actual BaseScan implementation class type
T - type of ScanTask returned
G - type of ScanTaskGroup returned
- All Implemented Interfaces:
Scan<ThisT,T, G>
- Direct Known Subclasses:
AllDataFilesTable.AllDataFilesTableScan, AllDeleteFilesTable.AllDeleteFilesTableScan, AllFilesTable.AllFilesTableScan, AllManifestsTable.AllManifestsTableScan, DataFilesTable.DataFilesTableScan, DataTableScan, DeleteFilesTable.DeleteFilesTableScan, FilesTable.FilesTableScan, PositionDeletesTable.PositionDeletesBatchScan, SparkDistributedDataScan
public abstract class SnapshotScan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
extends Object
This is a common base class to share code between different BaseScan implementations that handle
scans of a particular snapshot.
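The ThisT type parameter and the abstract newRefinedScan method suggest a self-typed, immutable refinement pattern: each configuration call builds a new scan of the concrete subclass type rather than mutating the current one. The following is a minimal sketch of that pattern under stated assumptions. Context, BaseScanSketch, and DataScanSketch are hypothetical stand-ins and not Iceberg classes, and the option key is only an illustrative string.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the ThisT pattern; none of these types are Iceberg classes. */
final class SelfTypedScanSketch {

  /** Immutable settings carried from one refined scan to the next. */
  static final class Context {
    final boolean caseSensitive;
    final Map<String, String> options;

    Context(boolean caseSensitive, Map<String, String> options) {
      this.caseSensitive = caseSensitive;
      this.options = Collections.unmodifiableMap(new HashMap<>(options));
    }
  }

  /** Shared refinement logic, parameterized by the concrete scan type. */
  abstract static class BaseScanSketch<ThisT> {
    private final Context context;

    BaseScanSketch(Context context) {
      this.context = context;
    }

    Context context() {
      return context;
    }

    /** Each subclass returns a new instance of its own type. */
    abstract ThisT newRefinedScan(Context newContext);

    ThisT caseSensitive(boolean value) {
      return newRefinedScan(new Context(value, context.options));
    }

    ThisT option(String property, String value) {
      Map<String, String> updated = new HashMap<>(context.options);
      updated.put(property, value);
      return newRefinedScan(new Context(context.caseSensitive, updated));
    }
  }

  /** A concrete scan that binds ThisT to itself so chained calls keep their type. */
  static final class DataScanSketch extends BaseScanSketch<DataScanSketch> {
    DataScanSketch(Context context) {
      super(context);
    }

    @Override
    DataScanSketch newRefinedScan(Context newContext) {
      return new DataScanSketch(newContext);
    }
  }

  public static void main(String[] args) {
    DataScanSketch base = new DataScanSketch(new Context(true, new HashMap<>()));
    DataScanSketch refined = base.caseSensitive(false).option("example.property", "1");
    // The original scan is unchanged; only the refined copy carries the new settings.
    System.out.println(base.context().caseSensitive);
    System.out.println(refined.context().caseSensitive);
    System.out.println(refined.context().options);
  }
}
```

Because every refinement funnels through newRefinedScan, the base class can implement all the builder-style methods once while subclasses only decide how to construct themselves.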
-
Field Summary
Fields
Modifier and Type, Field
protected static final boolean PLAN_SCANS_WITH_WORKER_POOL
-
Constructor Summary
Constructors
Modifier, Constructor
protected SnapshotScan(Table table, Schema schema, org.apache.iceberg.TableScanContext context)
-
Method Summary
Modifier and Type, Method - Description
asOfTime(long timestampMillis)
caseSensitive(boolean caseSensitive) - Create a new scan from this that, if data columns were selected via Scan.select(java.util.Collection), controls whether the match to the schema will be done with case sensitivity.
protected org.apache.iceberg.TableScanContext context()
protected abstract CloseableIterable<T> doPlanFiles()
filter() - Returns this scan's filter Expression.
filter(Expression expr) - Create a new scan from the results of this filtered by the Expression.
ignoreResiduals() - Create a new scan from this that applies data filtering to files but not to rows in those files.
includeColumnStats() - Create a new scan from this that loads the column stats with each data file.
includeColumnStats(Collection<String> requestedColumns) - Create a new scan from this that loads the column stats for the specific columns with each data file.
protected FileIO io()
boolean isCaseSensitive() - Returns whether this scan is case-sensitive with respect to column names.
metricsReporter(MetricsReporter reporter) - Create a new scan that will report scan metrics to the provided reporter in addition to reporters maintained by the scan.
protected abstract ThisT newRefinedScan(Table newTable, Schema newSchema, org.apache.iceberg.TableScanContext newContext)
option(String property, String value) - Create a new scan from this scan's configuration that will override the Table's behavior based on the incoming pair.
options()
protected ExecutorService planExecutor()
planFiles() - Plan tasks for this scan where each task reads a single file.
planWith(ExecutorService executorService) - Create a new scan to use a particular executor to plan.
project(Schema projectedSchema) - Create a new scan from this with the schema as its projection.
protected Expression residualFilter()
protected ScanMetrics scanMetrics()
schema() - Returns this scan's projection Schema.
select(Collection<String> columns) - Create a new scan from this that will read the given data columns.
protected boolean shouldIgnoreResiduals()
protected boolean shouldPlanWithExecutor()
protected boolean shouldReturnColumnStats()
snapshot()
protected Long snapshotId()
int splitLookback() - Returns the split lookback for this scan.
long splitOpenFileCost() - Returns the split open file cost for this scan.
table()
protected Schema tableSchema()
long targetSplitSize() - Returns the target split size for this scan.
toString()
useSnapshot(long scanSnapshotId)
protected boolean useSnapshotSchema()
-
Field Details
-
SCAN_COLUMNS
-
SCAN_WITH_STATS_COLUMNS
-
DELETE_SCAN_COLUMNS
-
DELETE_SCAN_WITH_STATS_COLUMNS
-
PLAN_SCANS_WITH_WORKER_POOL
protected static final boolean PLAN_SCANS_WITH_WORKER_POOL
-
-
Constructor Details
-
SnapshotScan
-
-
Method Details
-
snapshotId
-
doPlanFiles
-
useSnapshotSchema
protected boolean useSnapshotSchema() -
scanMetrics
-
useSnapshot
-
useRef
-
asOfTime
-
planFiles
Description copied from interface: Scan
Plan tasks for this scan where each task reads a single file.
Use Scan.planTasks() for planning balanced tasks where each task will read either a single file, a part of a file, or multiple files.
- Returns:
an Iterable of tasks scanning entire files required by this scan
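To illustrate the difference between file-level planning and balanced planning described above, here is a simplified sketch. It is not Iceberg's planner: Slice, planFiles, and planTasks are hypothetical helpers, files are modeled only by their byte sizes, and the packing is a plain sequential fill that ignores the split lookback and open file cost settings.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of file-level versus balanced planning; not Iceberg's planner. */
final class PlanningSketch {

  /** A byte range of one file. */
  static final class Slice {
    final int fileIndex;
    final long offset;
    final long length;

    Slice(int fileIndex, long offset, long length) {
      this.fileIndex = fileIndex;
      this.offset = offset;
      this.length = length;
    }
  }

  /** File-level planning: exactly one whole-file slice per file. */
  static List<Slice> planFiles(long[] fileSizes) {
    List<Slice> tasks = new ArrayList<>();
    for (int i = 0; i < fileSizes.length; i++) {
      tasks.add(new Slice(i, 0, fileSizes[i]));
    }
    return tasks;
  }

  /** Balanced planning: split large files, then pack consecutive slices up to the target. */
  static List<List<Slice>> planTasks(long[] fileSizes, long targetSplitSize) {
    if (targetSplitSize <= 0) {
      throw new IllegalArgumentException("targetSplitSize must be positive");
    }
    List<Slice> slices = new ArrayList<>();
    for (Slice file : planFiles(fileSizes)) {
      for (long offset = 0; offset < file.length; offset += targetSplitSize) {
        long length = Math.min(targetSplitSize, file.length - offset);
        slices.add(new Slice(file.fileIndex, offset, length));
      }
    }

    List<List<Slice>> groups = new ArrayList<>();
    List<Slice> current = new ArrayList<>();
    long currentSize = 0;
    for (Slice slice : slices) {
      // Start a new group when adding this slice would exceed the target size.
      if (!current.isEmpty() && currentSize + slice.length > targetSplitSize) {
        groups.add(current);
        current = new ArrayList<>();
        currentSize = 0;
      }
      current.add(slice);
      currentSize += slice.length;
    }
    if (!current.isEmpty()) {
      groups.add(current);
    }
    return groups;
  }

  public static void main(String[] args) {
    long[] sizes = {300, 40, 50};
    System.out.println(planFiles(sizes).size() + " file tasks");
    System.out.println(planTasks(sizes, 128).size() + " balanced groups");
  }
}
```

With sizes 300, 40, and 50 and a target of 128, the first file is cut into three parts and the tail of that file shares a group with the second file, which matches the "part of a file, or multiple files" wording.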
-
snapshot
-
toString
-
table
-
io
-
tableSchema
-
context
protected org.apache.iceberg.TableScanContext context() -
options
-
scanColumns
-
shouldReturnColumnStats
protected boolean shouldReturnColumnStats() -
columnsToKeepStats
-
shouldIgnoreResiduals
protected boolean shouldIgnoreResiduals() -
residualFilter
-
shouldPlanWithExecutor
protected boolean shouldPlanWithExecutor() -
planExecutor
-
newRefinedScan
-
option
Description copied from interface: Scan
Create a new scan from this scan's configuration that will override the Table's behavior based on the incoming pair. Unknown properties will be ignored.
- Specified by:
option in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Parameters:
property - name of the table property to be overridden
value - value to override with
- Returns:
a new scan based on this with overridden behavior
-
project
Description copied from interface: Scan
Create a new scan from this with the schema as its projection.
- Specified by:
project in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Parameters:
projectedSchema - a projection schema
- Returns:
a new scan based on this with the given projection
-
caseSensitive
Description copied from interface: Scan
Create a new scan from this that, if data columns were selected via Scan.select(java.util.Collection), controls whether the match to the schema will be done with case sensitivity. Default is true.
- Specified by:
caseSensitive in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Returns:
a new scan based on this with case sensitivity as stated
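As a rough illustration of what the case sensitivity flag controls, the sketch below resolves a requested column name against a list of schema column names. The resolve helper is hypothetical and models a schema as a flat list of strings, which is an assumption made for brevity.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical column matcher; a schema is modeled as a flat list of names. */
final class ColumnMatchSketch {

  /** Returns the schema's spelling of the requested column, if it matches. */
  static Optional<String> resolve(
      List<String> schemaColumns, String requested, boolean caseSensitive) {
    for (String column : schemaColumns) {
      boolean matches =
          caseSensitive ? column.equals(requested) : column.equalsIgnoreCase(requested);
      if (matches) {
        return Optional.of(column);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    List<String> columns = List.of("id", "eventTime");
    // With case sensitivity on, "EVENTTIME" does not match "eventTime".
    System.out.println(resolve(columns, "EVENTTIME", true).isPresent());
    System.out.println(resolve(columns, "EVENTTIME", false).isPresent());
  }
}
```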
-
isCaseSensitive
public boolean isCaseSensitive()
Description copied from interface: Scan
Returns whether this scan is case-sensitive with respect to column names.
- Specified by:
isCaseSensitive in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Returns:
true if case-sensitive, false otherwise.
-
includeColumnStats
Description copied from interface: Scan
Create a new scan from this that loads the column stats with each data file.
Column stats include: value count, null value count, lower bounds, and upper bounds.
- Specified by:
includeColumnStats in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Returns:
a new scan based on this that loads column stats.
-
includeColumnStats
Description copied from interface: Scan
Create a new scan from this that loads the column stats for the specific columns with each data file.
Column stats include: value count, null value count, lower bounds, and upper bounds.
- Specified by:
includeColumnStats in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Parameters:
requestedColumns - column names for which to keep the stats.
- Returns:
a new scan based on this that loads column stats for specific columns.
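The idea of keeping stats only for requested columns can be sketched as a simple map filter. This is an assumption-laden model: keepRequested is a hypothetical helper, and stats are keyed by column name here purely for readability, which may not reflect how the library stores them.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stats filter; stats are keyed by column name for readability. */
final class ColumnStatsSketch {

  /** Keeps only the entries whose column was requested. */
  static <V> Map<String, V> keepRequested(
      Map<String, V> statsByColumn, Collection<String> requestedColumns) {
    Map<String, V> kept = new LinkedHashMap<>();
    for (Map.Entry<String, V> entry : statsByColumn.entrySet()) {
      if (requestedColumns.contains(entry.getKey())) {
        kept.put(entry.getKey(), entry.getValue());
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    Map<String, Long> nullCounts = new LinkedHashMap<>();
    nullCounts.put("id", 0L);
    nullCounts.put("name", 3L);
    // Only the "id" entry survives the filter.
    System.out.println(keepRequested(nullCounts, List.of("id")));
  }
}
```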
-
select
Description copied from interface: Scan
Create a new scan from this that will read the given data columns. This produces an expected schema that includes all fields that are either selected or used by this scan's filter expression.
- Specified by:
select in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Parameters:
columns - column names from the table's schema
- Returns:
a new scan based on this with the given projection columns
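The rule that the expected schema contains fields that are either selected or used by the filter can be sketched as a union taken in table-schema order. The expectedColumns helper below is hypothetical, and it assumes the columns referenced by the filter have already been extracted into a collection.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Hypothetical projection helper; schemas are modeled as lists of column names. */
final class SelectSketch {

  /** Returns table columns that are selected or referenced by the filter, in table order. */
  static List<String> expectedColumns(
      List<String> tableColumns, Collection<String> selected, Collection<String> filterColumns) {
    List<String> result = new ArrayList<>();
    for (String column : tableColumns) {
      if (selected.contains(column) || filterColumns.contains(column)) {
        result.add(column);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> table = List.of("id", "name", "ts");
    // "ts" is not selected but is needed to evaluate the filter, so it is included.
    System.out.println(expectedColumns(table, List.of("name"), List.of("ts")));
  }
}
```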
-
filter
Description copied from interface: Scan
Create a new scan from the results of this filtered by the Expression.
- Specified by:
filter in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Parameters:
expr - a filter expression
- Returns:
a new scan based on this with results filtered by the expression
-
filter
Description copied from interface: Scan
Returns this scan's filter Expression.
- Specified by:
filter in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Returns:
this scan's filter expression
-
ignoreResiduals
Description copied from interface: Scan
Create a new scan from this that applies data filtering to files but not to rows in those files.
- Specified by:
ignoreResiduals in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Returns:
a new scan based on this that does not filter rows in files.
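The distinction between filtering files and filtering rows can be sketched with a single "value greater than threshold" predicate. This model is hypothetical: FileSketch and read are not Iceberg types, the file maximum is computed from the rows instead of being read from stored stats, and file pruning and row reading are folded into one method for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of file-level filtering versus row-level (residual) filtering. */
final class ResidualSketch {

  /** A data file holding integer rows; max() stands in for a stored upper bound. */
  static final class FileSketch {
    final int[] rows;

    FileSketch(int... rows) {
      this.rows = rows;
    }

    int max() {
      return Arrays.stream(rows).max().orElse(Integer.MIN_VALUE);
    }
  }

  /** Reads rows where value > threshold, optionally skipping the row-level check. */
  static List<Integer> read(List<FileSketch> files, int threshold, boolean ignoreResiduals) {
    List<Integer> result = new ArrayList<>();
    for (FileSketch file : files) {
      if (file.max() <= threshold) {
        continue; // file-level filtering: no row in this file can match
      }
      for (int value : file.rows) {
        // Row-level filtering is skipped when residuals are ignored.
        if (ignoreResiduals || value > threshold) {
          result.add(value);
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<FileSketch> files = List.of(new FileSketch(1, 2, 3), new FileSketch(4, 9));
    System.out.println(read(files, 5, false));
    System.out.println(read(files, 5, true));
  }
}
```

In both modes the first file is pruned entirely; only when residuals are ignored does the non-matching row from the second file come back, so callers that ignore residuals should expect extra rows.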
-
planWith
Description copied from interface: Scan
Create a new scan to use a particular executor to plan. If no executor is provided, the default worker pool is used.
- Specified by:
planWith in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Parameters:
executorService - the provided executor
- Returns:
a table scan that uses the provided executor to access manifests
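To show what planning with a caller-supplied executor might look like in principle, the sketch below reads several manifests in parallel on a provided ExecutorService. It is a hypothetical model: planAll is not an Iceberg method, and each manifest is represented as an in-memory list of file names rather than something read from storage.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical parallel planner; manifests are modeled as lists of file names. */
final class PlanExecutorSketch {

  /** Reads every manifest on the given executor and concatenates results in input order. */
  static List<String> planAll(ExecutorService executor, List<List<String>> manifests) {
    List<Callable<List<String>>> jobs = new ArrayList<>();
    for (List<String> manifest : manifests) {
      // Stand-in for opening and decoding a manifest file.
      jobs.add(() -> new ArrayList<>(manifest));
    }
    List<String> files = new ArrayList<>();
    try {
      // invokeAll returns futures in the same order as the submitted jobs.
      for (Future<List<String>> future : executor.invokeAll(jobs)) {
        files.addAll(future.get());
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Planning was interrupted", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("Planning failed", e);
    }
    return files;
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      System.out.println(planAll(pool, List.of(List.of("a", "b"), List.of("c"))));
    } finally {
      pool.shutdown();
    }
  }
}
```

Passing the executor in, rather than creating one inside the planner, lets the caller control thread counts and lifecycle, which appears to be the motivation for exposing planWith.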
-
schema
Description copied from interface: Scan
Returns this scan's projection Schema.
If the projection schema was set directly using Scan.project(Schema), returns that schema.
If the projection schema was set by calling Scan.select(Collection), returns a projection schema that includes the selected data fields and any fields used in the filter expression.
- Specified by:
schema in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
- Returns:
this scan's projection schema
-
targetSplitSize
public long targetSplitSize()
Description copied from interface: Scan
Returns the target split size for this scan.
- Specified by:
targetSplitSize in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
-
splitLookback
public int splitLookback()
Description copied from interface: Scan
Returns the split lookback for this scan.
- Specified by:
splitLookback in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
-
splitOpenFileCost
public long splitOpenFileCost()
Description copied from interface: Scan
Returns the split open file cost for this scan.
- Specified by:
splitOpenFileCost in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
-
metricsReporter
Description copied from interface: Scan
Create a new scan that will report scan metrics to the provided reporter in addition to reporters maintained by the scan.
- Specified by:
metricsReporter in interface Scan<ThisT,T extends ScanTask,G extends ScanTaskGroup<T>>
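The phrase "in addition to reporters maintained by the scan" suggests that reporters are combined rather than replaced. One way to picture that is a composite reporter, sketched below. Reporter and combine are hypothetical names, and a scan report is modeled as a plain string, which is a simplification.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical composite reporter; a scan report is modeled as a plain string. */
final class ReporterSketch {

  /** Minimal stand-in for a metrics reporter. */
  interface Reporter {
    void report(String scanReport);
  }

  /** Returns a reporter that forwards each report to both the existing and the added one. */
  static Reporter combine(Reporter existing, Reporter added) {
    return scanReport -> {
      existing.report(scanReport);
      added.report(scanReport);
    };
  }

  public static void main(String[] args) {
    List<String> maintainedByScan = new ArrayList<>();
    List<String> providedByCaller = new ArrayList<>();
    Reporter combined = combine(maintainedByScan::add, providedByCaller::add);
    combined.report("scan-1");
    // Both sinks receive the same report.
    System.out.println(maintainedByScan + " " + providedByCaller);
  }
}
```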
-