Package org.apache.iceberg.actions
Interface RewriteStrategy
- All Superinterfaces:
 java.io.Serializable
- All Known Implementing Classes:
 BinPackStrategy, SortStrategy, Spark3BinPackStrategy
public interface RewriteStrategy extends java.io.Serializable 
Method Summary
java.lang.String name()
 Returns the name of this rewrite strategy
RewriteStrategy options(java.util.Map<java.lang.String,java.lang.String> options)
 Sets options to be used with this strategy
java.lang.Iterable<java.util.List<FileScanTask>> planFileGroups(java.lang.Iterable<FileScanTask> dataFiles)
 Groups file scans into lists which will be processed in a single executable unit.
java.util.Set<DataFile> rewriteFiles(java.util.List<FileScanTask> filesToRewrite)
 Method which will rewrite files based on this particular RewriteStrategy's algorithm.
java.lang.Iterable<FileScanTask> selectFilesToRewrite(java.lang.Iterable<FileScanTask> dataFiles)
 Selects files which this strategy believes are valid targets to be rewritten.
Table table()
 Returns the table being modified by this rewrite strategy
java.util.Set<java.lang.String> validOptions()
 Returns a set of options which this rewrite strategy can use.
Method Detail
- 
name
java.lang.String name()
Returns the name of this rewrite strategy 
- 
table
Table table()
Returns the table being modified by this rewrite strategy 
- 
validOptions
java.util.Set<java.lang.String> validOptions()
Returns the set of options which this rewrite strategy can use. This is an allow-list: any option not listed here will be rejected at runtime.
- 
options
RewriteStrategy options(java.util.Map<java.lang.String,java.lang.String> options)
Sets options to be used with this strategy 
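
The allow-list behaviour described for validOptions() can be illustrated with a small, self-contained sketch. Everything here is hypothetical: OptionAllowList, validate, and the option names are placeholders for illustration and are not part of Iceberg. How the real implementations perform this check is not shown on this page.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper (not an Iceberg class): one way an allow-list check could be written.
final class OptionAllowList {
  private OptionAllowList() {}

  /** Throws IllegalArgumentException if any key in options is missing from validOptions. */
  static void validate(Set<String> validOptions, Map<String, String> options) {
    // Sorted copy so the error message is deterministic.
    Set<String> unknown = new TreeSet<>(options.keySet());
    unknown.removeAll(validOptions);
    if (!unknown.isEmpty()) {
      throw new IllegalArgumentException(
          "Unsupported options " + unknown + "; valid options are " + new TreeSet<>(validOptions));
    }
  }
}
```

An implementation of options(...) could run a check of this kind against validOptions() before storing the supplied map.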
- 
selectFilesToRewrite
java.lang.Iterable<FileScanTask> selectFilesToRewrite(java.lang.Iterable<FileScanTask> dataFiles)
Selects files which this strategy believes are valid targets to be rewritten.
- Parameters:
 dataFiles - iterable of FileScanTasks for files in a given partition
- Returns:
 iterable containing only FileScanTasks to be rewritten
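
As an illustration of what a selection step might look like, the sketch below keeps only files whose size falls outside a target range. This is an assumption made for the example: the criterion, the SizeBasedSelection class, and the use of plain long sizes in place of FileScanTask are all hypothetical, and each real strategy defines its own notion of a valid target.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in: file sizes (bytes) take the place of FileScanTask objects.
final class SizeBasedSelection {
  private SizeBasedSelection() {}

  /** Returns the sizes that are smaller than minBytes or larger than maxBytes, in input order. */
  static List<Long> selectOutsideRange(Iterable<Long> fileSizes, long minBytes, long maxBytes) {
    List<Long> selected = new ArrayList<>();
    for (long size : fileSizes) {
      if (size < minBytes || size > maxBytes) {
        selected.add(size);
      }
    }
    return selected;
  }
}
```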
 
 
- 
planFileGroups
java.lang.Iterable<java.util.List<FileScanTask>> planFileGroups(java.lang.Iterable<FileScanTask> dataFiles)
Groups file scans into lists which will be processed in a single executable unit. Each group will end up being committed as an independent set of changes. This creates the jobs which will eventually be run by the underlying Action.
- Parameters:
 dataFiles - iterable of FileScanTasks to be rewritten
- Returns:
 iterable of lists of FileScanTasks which will be processed together
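
The grouping step can be sketched with a simple sequential packing: files are added to the current group until the next file would push it past a size cap, at which point a new group starts. This is only an illustration under stated assumptions. GroupPlanner, maxGroupBytes, and the use of long sizes instead of FileScanTask are hypothetical, and the real strategies may group files differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in: file sizes (bytes) take the place of FileScanTask objects.
final class GroupPlanner {
  private GroupPlanner() {}

  /** Packs sizes, in input order, into groups whose total stays at or below maxGroupBytes. */
  static List<List<Long>> planGroups(Iterable<Long> fileSizes, long maxGroupBytes) {
    List<List<Long>> groups = new ArrayList<>();
    List<Long> current = new ArrayList<>();
    long currentBytes = 0;
    for (long size : fileSizes) {
      // Close the current group if this file would exceed the cap.
      if (!current.isEmpty() && currentBytes + size > maxGroupBytes) {
        groups.add(current);
        current = new ArrayList<>();
        currentBytes = 0;
      }
      // A single file larger than the cap still gets a group of its own.
      current.add(size);
      currentBytes += size;
    }
    if (!current.isEmpty()) {
      groups.add(current);
    }
    return groups;
  }
}
```

Each inner list corresponds to one unit of work, which matches the description of a group being processed and committed independently.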
 
 
- 
rewriteFiles
java.util.Set<DataFile> rewriteFiles(java.util.List<FileScanTask> filesToRewrite)
Rewrites files based on this particular RewriteStrategy's algorithm. The implementation will most likely be specific to the Action framework in use (Spark, Presto, Flink, ...).
- Parameters:
 filesToRewrite - a group of files to be rewritten together
- Returns:
 a set of newly written files
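
Read together, selectFilesToRewrite, planFileGroups, and rewriteFiles suggest a select, plan, rewrite flow. The sketch below shows that flow against a simplified, hypothetical mirror of the three methods. MiniRewriteStrategy and RewriteDriver are not Iceberg types, F stands in for FileScanTask and R for DataFile, and a real Action would also have to handle committing each group, which is omitted here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified mirror of three RewriteStrategy methods (not the Iceberg interface).
interface MiniRewriteStrategy<F, R> {
  Iterable<F> selectFilesToRewrite(Iterable<F> dataFiles);

  Iterable<List<F>> planFileGroups(Iterable<F> dataFiles);

  Set<R> rewriteFiles(List<F> filesToRewrite);
}

// Hypothetical driver: runs the three steps in order for one partition's files.
final class RewriteDriver {
  private RewriteDriver() {}

  /** Returns one set of newly written files per planned group, in planning order. */
  static <F, R> List<Set<R>> run(MiniRewriteStrategy<F, R> strategy, Iterable<F> partitionFiles) {
    List<Set<R>> resultsPerGroup = new ArrayList<>();
    Iterable<F> selected = strategy.selectFilesToRewrite(partitionFiles);
    for (List<F> group : strategy.planFileGroups(selected)) {
      // Each group is rewritten as its own unit of work.
      resultsPerGroup.add(strategy.rewriteFiles(group));
    }
    return resultsPerGroup;
  }
}
```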
 
 