Class RewriteDataFiles.Builder
java.lang.Object
org.apache.iceberg.flink.maintenance.api.MaintenanceTaskBuilder<RewriteDataFiles.Builder>
org.apache.iceberg.flink.maintenance.api.RewriteDataFiles.Builder
- Enclosing class:
RewriteDataFiles
public static class RewriteDataFiles.Builder
extends MaintenanceTaskBuilder<RewriteDataFiles.Builder>
-
Constructor Summary
Constructors
Builder()
Method Summary
Method - Description
config(RewriteDataFilesConfig rewriteDataFilesConfig) - Configures the properties for the rewriter.
deleteFileThreshold(int deleteFileThreshold) - Configures the minimum number of delete files for a data file after which a rewrite is always initiated.
filter(Expression newFilter) - A user-provided filter that determines which files are considered by the rewrite strategy.
maxFileGroupSizeBytes(long maxFileGroupSizeBytes) - Configures the group size for rewriting.
maxFileSizeBytes(long maxFileSizeBytes) - Configures the max file size considered for rewriting.
maxFilesToRewrite(int maxFilesToRewrite) - Configures the maximum number of files to rewrite.
maxRewriteBytes(long newMaxRewriteBytes) - Configures the maximum byte size of the rewrites for one scheduled compaction.
minFileSizeBytes(long minFileSizeBytes) - Configures the min file size considered for rewriting.
minInputFiles(int minInputFiles) - Configures the minimum number of input files after which a rewrite is always initiated.
partialProgressEnabled(boolean newPartialProgressEnabled) - Allows committing compacted data files in batches.
partialProgressMaxCommits(int newPartialProgressMaxCommits) - Configures the size of batches when partialProgressEnabled is set.
rewriteAll(boolean rewriteAll) - Overrides other options and forces rewriting of all provided files.
targetFileSizeBytes(long targetFileSizeBytes) - Configures the target file size.
Methods inherited from class org.apache.iceberg.flink.maintenance.api.MaintenanceTaskBuilder
index, operatorName, parallelism, parallelism, scheduleOnCommitCount, scheduleOnDataFileCount, scheduleOnDataFileSize, scheduleOnEqDeleteFileCount, scheduleOnEqDeleteRecordCount, scheduleOnInterval, scheduleOnPosDeleteFileCount, scheduleOnPosDeleteRecordCount, slotSharingGroup, slotSharingGroup, tableLoader, tableName, taskName, uidSuffix, uidSuffix
-
Constructor Details
-
Builder
public Builder()
-
-
Method Details
-
partialProgressEnabled
Allows committing compacted data files in batches. See RewriteDataFiles.PARTIAL_PROGRESS_ENABLED for more details.
Parameters:
newPartialProgressEnabled - whether to enable partial commits
-
partialProgressMaxCommits
Configures the size of batches when partialProgressEnabled is set. See RewriteDataFiles.PARTIAL_PROGRESS_MAX_COMMITS for more details.
Parameters:
newPartialProgressMaxCommits - the target number of commits per run
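To make the batching idea concrete, here is a minimal sketch of how a set of rewritten file groups could be split into at most a given number of commits. It assumes the batch size is the ceiling of the group count divided by the commit limit; that rule, the class name PartialProgressSketch, and the method batches are illustrative assumptions and are not taken from the Iceberg Flink API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not part of the Iceberg API. */
final class PartialProgressSketch {

  /** Splits groups into at most maxCommits batches of (nearly) equal size. */
  static <T> List<List<T>> batches(List<T> groups, int maxCommits) {
    if (maxCommits <= 0) {
      throw new IllegalArgumentException("maxCommits must be positive");
    }
    // Assumption: groups per commit = ceil(totalGroups / maxCommits), at least 1.
    int perCommit = Math.max(1, (groups.size() + maxCommits - 1) / maxCommits);
    List<List<T>> result = new ArrayList<>();
    for (int start = 0; start < groups.size(); start += perCommit) {
      int end = Math.min(start + perCommit, groups.size());
      result.add(new ArrayList<>(groups.subList(start, end)));
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> groups = List.of("g1", "g2", "g3", "g4", "g5");
    // Five groups with a limit of two commits yield batches of three and two.
    System.out.println(batches(groups, 2)); // [[g1, g2, g3], [g4, g5]]
  }
}
```

Under this assumption, a larger commit limit means smaller batches, so a failure partway through a run loses less completed work.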
-
maxRewriteBytes
Configures the maximum byte size of the rewrites for one scheduled compaction. This can be used to limit the resources used by the compaction.
Parameters:
newMaxRewriteBytes - the limit on the size of the rewrites
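As a rough illustration of a byte budget per compaction run, the sketch below selects file groups while their combined input size stays within the limit. It assumes that a group which would exceed the budget is skipped and that later, smaller groups may still be selected; this behavior, the class name RewriteBudgetSketch, and the method selectWithinBudget are assumptions made for the example, not a description of the actual planner.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not part of the Iceberg API. */
final class RewriteBudgetSketch {

  /** Returns the group sizes that fit within maxRewriteBytes, in input order. */
  static List<Long> selectWithinBudget(List<Long> groupSizesBytes, long maxRewriteBytes) {
    List<Long> selected = new ArrayList<>();
    long usedBytes = 0L;
    for (long size : groupSizesBytes) {
      if (usedBytes + size > maxRewriteBytes) {
        // Assumption: skip this group; a later, smaller group may still fit.
        continue;
      }
      usedBytes += size;
      selected.add(size);
    }
    return selected;
  }

  public static void main(String[] args) {
    // With a budget of 1000 bytes, the 700-byte group does not fit after the 400-byte one.
    System.out.println(selectWithinBudget(List.of(400L, 700L, 300L), 1000L)); // [400, 300]
  }
}
```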
-
targetFileSizeBytes
Configures the target file size. See RewriteDataFiles.TARGET_FILE_SIZE_BYTES for more details.
Parameters:
targetFileSizeBytes - target file size
-
minFileSizeBytes
Configures the min file size considered for rewriting. See SizeBasedFileRewritePlanner.MIN_FILE_SIZE_BYTES for more details.
Parameters:
minFileSizeBytes - min file size
-
maxFileSizeBytes
Configures the max file size considered for rewriting. See SizeBasedFileRewritePlanner.MAX_FILE_SIZE_BYTES for more details.
Parameters:
maxFileSizeBytes - max file size
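The min and max file size settings together define a size window. The sketch below assumes that a file becomes a rewrite candidate when its size falls below the minimum or above the maximum, and that files inside the window are left alone unless another criterion applies. The class name SizeWindowSketch, the method outsideSizeWindow, and the byte values are hypothetical and chosen only for illustration.

```java
/** Hypothetical illustration only; not part of the Iceberg API. */
final class SizeWindowSketch {

  /** Returns true when the file size lies outside [minFileSizeBytes, maxFileSizeBytes]. */
  static boolean outsideSizeWindow(long fileSizeBytes, long minFileSizeBytes, long maxFileSizeBytes) {
    return fileSizeBytes < minFileSizeBytes || fileSizeBytes > maxFileSizeBytes;
  }

  public static void main(String[] args) {
    long mb = 1024L * 1024L;
    long min = 96 * mb;  // illustrative lower bound
    long max = 230 * mb; // illustrative upper bound

    System.out.println(outsideSizeWindow(10 * mb, min, max));  // true: too small
    System.out.println(outsideSizeWindow(128 * mb, min, max)); // false: inside the window
    System.out.println(outsideSizeWindow(512 * mb, min, max)); // true: too large
  }
}
```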
-
minInputFiles
Configures the minimum number of input files after which a rewrite is always initiated. See SizeBasedFileRewritePlanner.MIN_INPUT_FILES for more details.
Parameters:
minInputFiles - min file number
-
deleteFileThreshold
Configures the minimum number of delete files for a data file after which a rewrite is always initiated. See BinPackRewriteFilePlanner.DELETE_FILE_THRESHOLD for more details.
Parameters:
deleteFileThreshold - min delete file number
-
rewriteAll
Overrides other options and forces rewriting of all provided files.
Parameters:
rewriteAll - enables a full rewrite
-
maxFileGroupSizeBytes
Configures the group size for rewriting. See SizeBasedFileRewritePlanner.MAX_FILE_GROUP_SIZE_BYTES for more details.
Parameters:
maxFileGroupSizeBytes - file group size for rewrite
-
maxFilesToRewrite
Configures the maximum number of files to rewrite. See BinPackRewriteFilePlanner.MAX_FILES_TO_REWRITE for more details.
Parameters:
maxFilesToRewrite - maximum files to rewrite
-
filter
A user-provided filter that determines which files are considered by the rewrite strategy.
Parameters:
newFilter - the filter expression to apply
Returns:
this for method chaining
-
config
Configures the properties for the rewriter.
Parameters:
rewriteDataFilesConfig - properties for the rewriter
-