Package org.apache.iceberg
Class BaseFileScanTask
java.lang.Object
org.apache.iceberg.BaseFileScanTask
- All Implemented Interfaces:
Serializable, ContentScanTask<DataFile>, FileScanTask, PartitionScanTask, ScanTask, SplittableScanTask<FileScanTask>
Constructor Summary
Constructors
Constructor, Description
BaseFileScanTask(DataFile file, DeleteFile[] deletes, String schemaString, String specString, ResidualEvaluator residuals)
Method Summary
Modifier and Type, Method, Description
deletes()
    A list of delete files to apply when reading the task's data file.
long estimatedRowsCount()
    The estimated number of rows produced by this scan task.
file()
    The file to scan.
int filesCount()
    The number of files that will be opened by this scan task.
long length()
    The number of bytes to scan from the ContentScanTask.start() position in the file.
protected FileScanTask newSplitTask(FileScanTask parentTask, long offset, long length)
residual()
    Returns the residual expression that should be applied to rows in this file scan.
schema()
    Return the schema for this file scan task.
protected FileScanTask self()
long sizeBytes()
    The number of bytes that should be read by this scan task.
spec()
    Returns the spec of the partition for this scan task.
split(long targetSplitSize)
    Attempts to split this scan task into several smaller scan tasks, each close to splitSize size.
long start()
    The starting position of this scan range in the file.
toString()
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.iceberg.ContentScanTask
estimatedRowsCount, file, length, partition, residual, start
Methods inherited from interface org.apache.iceberg.FileScanTask
asFileScanTask, isFileScanTask
Methods inherited from interface org.apache.iceberg.PartitionScanTask
spec
Methods inherited from interface org.apache.iceberg.ScanTask
asCombinedScanTask, asDataTask, isDataTask
Methods inherited from interface org.apache.iceberg.SplittableScanTask
split
-
Constructor Details
-
BaseFileScanTask
public BaseFileScanTask(DataFile file, DeleteFile[] deletes, String schemaString, String specString, ResidualEvaluator residuals)
-
-
Method Details
-
self
protected FileScanTask self()
-
newSplitTask
protected FileScanTask newSplitTask(FileScanTask parentTask, long offset, long length)
-
deletes
Description copied from interface: FileScanTask
A list of delete files to apply when reading the task's data file.
- Specified by:
deletes in interface FileScanTask
- Returns:
a list of delete files to apply
-
sizeBytes
public long sizeBytes()
Description copied from interface: ScanTask
The number of bytes that should be read by this scan task.
- Specified by:
sizeBytes in interface ContentScanTask<DataFile>
- Specified by:
sizeBytes in interface FileScanTask
- Specified by:
sizeBytes in interface ScanTask
- Returns:
the total number of bytes to read
-
filesCount
public int filesCount()
Description copied from interface: ScanTask
The number of files that will be opened by this scan task.
- Specified by:
filesCount in interface FileScanTask
- Specified by:
filesCount in interface ScanTask
- Returns:
the number of files to open
-
schema
Description copied from interface: FileScanTask
Returns the schema for this file scan task.
- Specified by:
schema in interface FileScanTask
-
file
Description copied from interface: ContentScanTask
The file to scan.
- Specified by:
file in interface ContentScanTask<ThisT extends ContentScanTask<F>>
- Returns:
the file to scan
-
spec
Description copied from interface: PartitionScanTask
Returns the spec of the partition for this scan task.
- Specified by:
spec in interface PartitionScanTask
-
start
public long start()
Description copied from interface: ContentScanTask
The starting position of this scan range in the file.
- Specified by:
start in interface ContentScanTask<ThisT extends ContentScanTask<F>>
- Returns:
the start position of this scan range
-
length
public long length()
Description copied from interface: ContentScanTask
The number of bytes to scan from the ContentScanTask.start() position in the file.
- Specified by:
length in interface ContentScanTask<ThisT extends ContentScanTask<F>>
- Returns:
the length of this scan range in bytes
-
residual
Description copied from interface: ContentScanTask
Returns the residual expression that should be applied to rows in this file scan.
The residual expression for a file is a filter expression created by partially evaluating the scan's filter using the file's partition data.
- Specified by:
residual in interface ContentScanTask<ThisT extends ContentScanTask<F>>
- Returns:
a residual expression to apply to rows from this scan
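The idea of partially evaluating a filter against partition data can be illustrated with a small, self-contained sketch. It is not Iceberg code: it assumes the filter is a conjunction of equality predicates and that partition values are plain column values (identity partitioning). ResidualSketch and its residual helper are hypothetical names; the real ResidualEvaluator handles general expressions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch, not Iceberg code: residual of an AND of "column == value" predicates. */
final class ResidualSketch {

  /**
   * Partially evaluates the filter using one file's partition values.
   * Returns an empty Optional when no row in the file can match; otherwise
   * returns the predicates that still have to be checked row by row.
   */
  static Optional<Map<String, String>> residual(
      Map<String, String> filter, Map<String, String> partition) {
    Map<String, String> remaining = new LinkedHashMap<>();
    for (Map.Entry<String, String> predicate : filter.entrySet()) {
      String column = predicate.getKey();
      if (!partition.containsKey(column)) {
        // Not decided by partition data: must still be applied to each row.
        remaining.put(column, predicate.getValue());
      } else if (!partition.get(column).equals(predicate.getValue())) {
        // The partition value contradicts the predicate: the file cannot match.
        return Optional.empty();
      }
      // Otherwise the predicate holds for every row in the file and is dropped.
    }
    return Optional.of(remaining);
  }

  public static void main(String[] args) {
    Map<String, String> filter = new LinkedHashMap<>();
    filter.put("day", "2024-01-01");
    filter.put("status", "ok");

    // The file's partition already guarantees the day predicate,
    // so only the status predicate remains in the residual.
    System.out.println(residual(filter, Map.of("day", "2024-01-01")));
  }
}
```

In this sketch, an empty residual map means every row in the file satisfies the filter, while an empty Optional means the file can be skipped entirely.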
-
estimatedRowsCount
public long estimatedRowsCount()
Description copied from interface: ScanTask
The estimated number of rows produced by this scan task.
- Specified by:
estimatedRowsCount in interface ContentScanTask<ThisT extends ContentScanTask<F>>
- Specified by:
estimatedRowsCount in interface ScanTask
- Returns:
the estimated number of produced rows
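This page does not state how the estimate is computed. One plausible approach for a task that covers only part of a file is to scale the file's record count by the fraction of bytes scanned. The sketch below assumes that proportional approach; RowEstimateSketch, estimateRows, and its parameters are hypothetical names, not part of the documented API.

```java
/** Hypothetical sketch, not Iceberg code: proportional row estimate for a byte range. */
final class RowEstimateSketch {

  /**
   * Estimates rows in a scan range by assuming rows are spread evenly over the file.
   *
   * @param scanLength number of bytes covered by the scan range
   * @param fileSizeInBytes total size of the file in bytes
   * @param recordCount total number of records in the file
   */
  static long estimateRows(long scanLength, long fileSizeInBytes, long recordCount) {
    if (fileSizeInBytes <= 0) {
      throw new IllegalArgumentException("fileSizeInBytes must be positive");
    }
    double scannedFraction = (double) scanLength / fileSizeInBytes;
    // Truncates toward zero, so the estimate never exceeds the scaled count.
    return (long) (scannedFraction * recordCount);
  }

  public static void main(String[] args) {
    // A quarter of a 100-byte file holding 1000 records: prints 250.
    System.out.println(estimateRows(25, 100, 1000));
  }
}
```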
-
split
Description copied from interface: SplittableScanTask
Attempts to split this scan task into several smaller scan tasks, each close to splitSize size.
Note the target split size is just guidance and the actual split size may be either smaller or larger. File formats like Parquet may leverage the row group offset information while splitting tasks.
- Specified by:
split in interface SplittableScanTask<ThisT extends ContentScanTask<F>>
- Parameters:
targetSplitSize - the target size of each new scan task in bytes
- Returns:
an Iterable of smaller tasks
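The splitting behavior described above can be pictured with a minimal, self-contained sketch. It assumes simple fixed-size offset splitting of a single byte range; RangeSplitSketch, Range, and the split helper are hypothetical names and not part of Iceberg. The actual implementation may instead follow format-specific split offsets, such as Parquet row group boundaries.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not Iceberg code: cut one byte range into target-sized pieces. */
final class RangeSplitSketch {

  /** A contiguous byte range, mirroring the start() and length() pair of a scan task. */
  static final class Range {
    final long start;
    final long length;

    Range(long start, long length) {
      this.start = start;
      this.length = length;
    }

    @Override
    public String toString() {
      return "[start=" + start + ", length=" + length + "]";
    }
  }

  /** Splits [start, start + length) into consecutive ranges of at most targetSplitSize bytes. */
  static List<Range> split(long start, long length, long targetSplitSize) {
    if (targetSplitSize <= 0) {
      throw new IllegalArgumentException("targetSplitSize must be positive");
    }
    List<Range> ranges = new ArrayList<>();
    long end = start + length;
    for (long offset = start; offset < end; offset += targetSplitSize) {
      // The final range may be shorter than the target size.
      ranges.add(new Range(offset, Math.min(targetSplitSize, end - offset)));
    }
    return ranges;
  }

  public static void main(String[] args) {
    // A 10-byte range with a 4-byte target yields ranges of 4, 4, and 2 bytes.
    for (Range range : split(0, 10, 4)) {
      System.out.println(range);
    }
  }
}
```

Because the last range takes whatever bytes remain, the pieces always cover the original range exactly, which matches the note that the target size is guidance rather than a guarantee.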
-
toString
-