RewritePositionDeleteFilesSparkAction

java.lang.Object
- org.apache.iceberg.spark.actions.RewritePositionDeleteFilesSparkAction

All Implemented Interfaces:

Action<RewritePositionDeleteFiles,RewritePositionDeleteFiles.Result>, RewritePositionDeleteFiles, SnapshotUpdate<RewritePositionDeleteFiles,RewritePositionDeleteFiles.Result>
```
public class RewritePositionDeleteFilesSparkAction
extends java.lang.Object
implements RewritePositionDeleteFiles
```
Spark implementation of RewritePositionDeleteFiles.

Nested Class Summary
- Nested classes/interfaces inherited from interface org.apache.iceberg.actions.RewritePositionDeleteFiles
  RewritePositionDeleteFiles.FileGroupInfo, RewritePositionDeleteFiles.FileGroupRewriteResult, RewritePositionDeleteFiles.Result

Field Summary

Fields
Modifier and Type	Field and Description
`protected static org.apache.iceberg.relocated.com.google.common.base.Joiner`	`COMMA_JOINER`
`protected static org.apache.iceberg.relocated.com.google.common.base.Splitter`	`COMMA_SPLITTER`
`protected static java.lang.String`	`FILE_PATH`
`protected static java.lang.String`	`LAST_MODIFIED`
`protected static java.lang.String`	`MANIFEST`
`protected static java.lang.String`	`MANIFEST_LIST`
`protected static java.lang.String`	`OTHERS`
`protected static java.lang.String`	`STATISTICS_FILES`

Fields inherited from interface org.apache.iceberg.actions.RewritePositionDeleteFiles
MAX_CONCURRENT_FILE_GROUP_REWRITES, MAX_CONCURRENT_FILE_GROUP_REWRITES_DEFAULT, PARTIAL_PROGRESS_ENABLED, PARTIAL_PROGRESS_ENABLED_DEFAULT, PARTIAL_PROGRESS_MAX_COMMITS, PARTIAL_PROGRESS_MAX_COMMITS_DEFAULT, REWRITE_JOB_ORDER, REWRITE_JOB_ORDER_DEFAULT

Method Summary

Modifier and Type	Method and Description
`protected org.apache.spark.sql.Dataset<FileInfo>`	`allReachableOtherMetadataFileDS(Table table)`
`protected void`	`commit(SnapshotUpdate<?> update)`
`protected java.util.Map<java.lang.String,java.lang.String>`	`commitSummary()`
`protected org.apache.spark.sql.Dataset<FileInfo>`	`contentFileDS(Table table)`
`protected org.apache.spark.sql.Dataset<FileInfo>`	`contentFileDS(Table table, java.util.Set<java.lang.Long> snapshotIds)`
`protected org.apache.iceberg.spark.actions.BaseSparkAction.DeleteSummary`	`deleteFiles(java.util.concurrent.ExecutorService executorService, java.util.function.Consumer<java.lang.String> deleteFunc, java.util.Iterator<FileInfo> files)` Deletes files and keeps track of how many files were removed for each file type.
`protected org.apache.iceberg.spark.actions.BaseSparkAction.DeleteSummary`	`deleteFiles(SupportsBulkOperations io, java.util.Iterator<FileInfo> files)`
`RewritePositionDeleteFiles.Result`	`execute()` Executes this action.
`RewritePositionDeleteFilesSparkAction`	`filter(Expression expression)` A filter for finding deletes to rewrite.
`protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>`	`loadMetadataTable(Table table, MetadataTableType type)`
`protected org.apache.spark.sql.Dataset<FileInfo>`	`manifestDS(Table table)`
`protected org.apache.spark.sql.Dataset<FileInfo>`	`manifestDS(Table table, java.util.Set<java.lang.Long> snapshotIds)`
`protected org.apache.spark.sql.Dataset<FileInfo>`	`manifestListDS(Table table)`
`protected org.apache.spark.sql.Dataset<FileInfo>`	`manifestListDS(Table table, java.util.Set<java.lang.Long> snapshotIds)`
`protected JobGroupInfo`	`newJobGroupInfo(java.lang.String groupId, java.lang.String desc)`
`protected Table`	`newStaticTable(TableMetadata metadata, FileIO io)`
`ThisT`	`option(java.lang.String name, java.lang.String value)`
`protected java.util.Map<java.lang.String,java.lang.String>`	`options()`
`ThisT`	`options(java.util.Map<java.lang.String,java.lang.String> newOptions)`
`protected org.apache.spark.sql.Dataset<FileInfo>`	`otherMetadataFileDS(Table table)`
`protected RewritePositionDeleteFilesSparkAction`	`self()`
`ThisT`	`snapshotProperty(java.lang.String property, java.lang.String value)`
`protected org.apache.spark.sql.SparkSession`	`spark()`
`protected org.apache.spark.api.java.JavaSparkContext`	`sparkContext()`
`protected org.apache.spark.sql.Dataset<FileInfo>`	`statisticsFileDS(Table table, java.util.Set<java.lang.Long> snapshotIds)`
`protected <T> T`	`withJobGroupInfo(JobGroupInfo info, java.util.function.Supplier<T> supplier)`

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.iceberg.actions.SnapshotUpdate
snapshotProperty

Methods inherited from interface org.apache.iceberg.actions.Action
option, options

Field Detail

MANIFEST

protected static final java.lang.String MANIFEST

See Also: Constant Field Values

MANIFEST_LIST

protected static final java.lang.String MANIFEST_LIST

See Also: Constant Field Values

STATISTICS_FILES

protected static final java.lang.String STATISTICS_FILES

See Also: Constant Field Values

OTHERS

protected static final java.lang.String OTHERS

See Also: Constant Field Values

FILE_PATH

protected static final java.lang.String FILE_PATH

See Also: Constant Field Values

LAST_MODIFIED

protected static final java.lang.String LAST_MODIFIED

See Also: Constant Field Values

COMMA_SPLITTER

protected static final org.apache.iceberg.relocated.com.google.common.base.Splitter COMMA_SPLITTER

COMMA_JOINER

protected static final org.apache.iceberg.relocated.com.google.common.base.Joiner COMMA_JOINER

Method Detail

self

protected RewritePositionDeleteFilesSparkAction self()

filter
```
public RewritePositionDeleteFilesSparkAction filter(Expression expression)
```
Description copied from interface: RewritePositionDeleteFiles

A filter for finding deletes to rewrite.
The filter will be converted to a partition filter with an inclusive projection. Any file that may contain rows matching this filter will be used by the action. The matching delete files will be rewritten.

Specified by:

filter in interface RewritePositionDeleteFiles

Parameters:

expression - An Iceberg expression used to find deletes.

Returns:

this for method chaining
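
To illustrate what an inclusive projection means here, the following is a minimal, library-free sketch and not Iceberg's implementation. It assumes a hypothetical table partitioned by the day of an event timestamp and a row filter of the form `ts >= cutoff`. The names `InclusiveProjectionSketch`, `partitionDay`, `mayContainMatches`, and `selectPartitions` are illustrative only and do not exist in Iceberg.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only; none of these names exist in Iceberg.
class InclusiveProjectionSketch {

  // Partition transform assumed for this sketch: day(ts).
  static LocalDate partitionDay(LocalDateTime ts) {
    return ts.toLocalDate();
  }

  // Inclusive projection of the row filter "ts >= cutoff" onto day partitions:
  // a partition is kept if it MAY contain a matching row, i.e. day >= day(cutoff).
  static boolean mayContainMatches(LocalDate partition, LocalDateTime cutoff) {
    return !partition.isBefore(partitionDay(cutoff));
  }

  // Keeps every partition whose files could hold rows matching the filter.
  static List<LocalDate> selectPartitions(List<LocalDate> partitions, LocalDateTime cutoff) {
    return partitions.stream()
        .filter(p -> mayContainMatches(p, cutoff))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    LocalDateTime cutoff = LocalDateTime.of(2024, 3, 10, 12, 0);
    List<LocalDate> partitions = Arrays.asList(
        LocalDate.of(2024, 3, 9), LocalDate.of(2024, 3, 10), LocalDate.of(2024, 3, 11));
    // Prints [2024-03-10, 2024-03-11]
    System.out.println(selectPartitions(partitions, cutoff));
  }
}
```

In this sketch the partition for 2024-03-10 is selected even though only rows from noon onward match the filter. That is the sense in which any file that may contain matching rows is used by the action.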

execute
```
public RewritePositionDeleteFiles.Result execute()
```
Description copied from interface: Action

Executes this action.

Specified by:

execute in interface Action<RewritePositionDeleteFiles,RewritePositionDeleteFiles.Result>

Returns:

the result of this action

snapshotProperty

public ThisT snapshotProperty(java.lang.String property,
                              java.lang.String value)

commit

protected void commit(SnapshotUpdate<?> update)

commitSummary

protected java.util.Map<java.lang.String,java.lang.String> commitSummary()

spark

protected org.apache.spark.sql.SparkSession spark()

sparkContext

protected org.apache.spark.api.java.JavaSparkContext sparkContext()

option

public ThisT option(java.lang.String name,
                    java.lang.String value)

options

public ThisT options(java.util.Map<java.lang.String,java.lang.String> newOptions)

options

protected java.util.Map<java.lang.String,java.lang.String> options()
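
The `ThisT` return type on `option`, `options`, and `snapshotProperty`, together with `self()` returning the concrete action type, suggests a self-typed fluent base class. The following is a minimal sketch of that pattern under that assumption. `FluentOptionsSketch`, `BaseAction`, `DemoAction`, and the option keys are hypothetical and are not Iceberg names or confirmed option keys.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a self-typed fluent API; not Iceberg code.
class FluentOptionsSketch {

  // ThisT is the concrete subclass type, so chained calls keep that type.
  abstract static class BaseAction<ThisT> {
    private final Map<String, String> options = new HashMap<>();

    // Each subclass returns itself, which lets the base class return ThisT.
    protected abstract ThisT self();

    public ThisT option(String name, String value) {
      options.put(name, value);
      return self();
    }

    public ThisT options(Map<String, String> newOptions) {
      options.putAll(newOptions);
      return self();
    }

    // Read-only view of the options collected so far.
    protected Map<String, String> options() {
      return Collections.unmodifiableMap(options);
    }
  }

  static final class DemoAction extends BaseAction<DemoAction> {
    @Override
    protected DemoAction self() {
      return this;
    }

    int optionCount() {
      return options().size();
    }
  }

  public static void main(String[] args) {
    Map<String, String> more = new HashMap<>();
    more.put("example-key-b", "2"); // placeholder key, not a real option name
    DemoAction action = new DemoAction().option("example-key-a", "1").options(more);
    // Prints 2
    System.out.println(action.optionCount());
  }
}
```

Because `option` returns `DemoAction` rather than the base type, subclass-specific methods remain available at any point in the chain without a cast.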

withJobGroupInfo

protected <T> T withJobGroupInfo(JobGroupInfo info,
                                 java.util.function.Supplier<T> supplier)

newJobGroupInfo

protected JobGroupInfo newJobGroupInfo(java.lang.String groupId,
                                       java.lang.String desc)

newStaticTable

protected Table newStaticTable(TableMetadata metadata,
                               FileIO io)

contentFileDS

protected org.apache.spark.sql.Dataset<FileInfo> contentFileDS(Table table)

contentFileDS

protected org.apache.spark.sql.Dataset<FileInfo> contentFileDS(Table table,
                                                               java.util.Set<java.lang.Long> snapshotIds)

manifestDS

protected org.apache.spark.sql.Dataset<FileInfo> manifestDS(Table table)

manifestDS

protected org.apache.spark.sql.Dataset<FileInfo> manifestDS(Table table,
                                                            java.util.Set<java.lang.Long> snapshotIds)

manifestListDS

protected org.apache.spark.sql.Dataset<FileInfo> manifestListDS(Table table)

manifestListDS

protected org.apache.spark.sql.Dataset<FileInfo> manifestListDS(Table table,
                                                                java.util.Set<java.lang.Long> snapshotIds)

statisticsFileDS

protected org.apache.spark.sql.Dataset<FileInfo> statisticsFileDS(Table table,
                                                                  java.util.Set<java.lang.Long> snapshotIds)

otherMetadataFileDS

protected org.apache.spark.sql.Dataset<FileInfo> otherMetadataFileDS(Table table)

allReachableOtherMetadataFileDS

protected org.apache.spark.sql.Dataset<FileInfo> allReachableOtherMetadataFileDS(Table table)

loadMetadataTable

protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> loadMetadataTable(Table table,
                                                                                   MetadataTableType type)

deleteFiles

protected org.apache.iceberg.spark.actions.BaseSparkAction.DeleteSummary deleteFiles(java.util.concurrent.ExecutorService executorService,
                                                                                     java.util.function.Consumer<java.lang.String> deleteFunc,
                                                                                     java.util.Iterator<FileInfo> files)

Deletes files and keeps track of how many files were removed for each file type.

Parameters:

executorService - an executor service to use for parallel deletes
deleteFunc - a function that deletes the file at the given path
files - an iterator of Spark rows of the structure (path: String, type: String)

Returns:

stats on which files were deleted
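
The behavior described above (delete in parallel, count removals per file type) can be sketched with the standard library alone. This is an assumption-laden illustration, not the actual `BaseSparkAction` code: `FileEntry` is a hypothetical stand-in for `FileInfo`, the returned map stands in for `DeleteSummary`, the type labels are placeholders, and skipping failed deletes is a policy chosen for this sketch only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// Hypothetical illustration only; not the Iceberg implementation.
class DeleteFilesSketch {

  // Stand-in for FileInfo with the structure (path: String, type: String).
  static final class FileEntry {
    final String path;
    final String type;

    FileEntry(String path, String type) {
      this.path = path;
      this.type = type;
    }
  }

  // Submits one delete task per file and returns the number of deleted files per type.
  static Map<String, Long> deleteFiles(
      ExecutorService executorService, Consumer<String> deleteFunc, Iterator<FileEntry> files) {
    Map<String, AtomicLong> counts = new ConcurrentHashMap<>();
    List<Future<?>> pending = new ArrayList<>();
    while (files.hasNext()) {
      FileEntry file = files.next();
      pending.add(executorService.submit(() -> {
        try {
          deleteFunc.accept(file.path);
          counts.computeIfAbsent(file.type, t -> new AtomicLong()).incrementAndGet();
        } catch (RuntimeException e) {
          // Sketch-only policy: a failed delete is simply not counted.
        }
      }));
    }
    for (Future<?> task : pending) {
      try {
        task.get(); // wait for every delete to finish before summarizing
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while deleting files", e);
      } catch (ExecutionException e) {
        throw new IllegalStateException("Delete task failed", e.getCause());
      }
    }
    Map<String, Long> summary = new TreeMap<>();
    counts.forEach((type, n) -> summary.put(type, n.get()));
    return summary;
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      List<FileEntry> files = Arrays.asList(
          new FileEntry("m1.avro", "manifest"),
          new FileEntry("m2.avro", "manifest"),
          new FileEntry("v1.metadata.json", "other"));
      // The delete function is a no-op here; a real one would remove the file.
      Map<String, Long> summary = deleteFiles(pool, path -> { }, files.iterator());
      // Prints {manifest=2, other=1}
      System.out.println(summary);
    } finally {
      pool.shutdown();
    }
  }
}
```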

deleteFiles

protected org.apache.iceberg.spark.actions.BaseSparkAction.DeleteSummary deleteFiles(SupportsBulkOperations io,
                                                                                     java.util.Iterator<FileInfo> files)
