Package org.apache.iceberg.hadoop
Class HadoopFileIO
java.lang.Object
org.apache.iceberg.hadoop.HadoopFileIO
- All Implemented Interfaces:
Closeable, Serializable, AutoCloseable, org.apache.hadoop.conf.Configurable, HadoopConfigurable, DelegateFileIO, FileIO, SupportsBulkOperations, SupportsPrefixOperations
Constructor Summary
Constructors
Constructor - Description
HadoopFileIO() - Constructor used for dynamic FileIO loading.
HadoopFileIO(org.apache.hadoop.conf.Configuration hadoopConf)
HadoopFileIO(SerializableSupplier<org.apache.hadoop.conf.Configuration> hadoopConf)
-
Method Summary
Modifier and Type, Method - Description
org.apache.hadoop.conf.Configuration conf()
void deleteFile(String path) - Delete the file at the given path.
void deleteFiles(Iterable<String> pathsToDelete) - Delete the files at the given paths.
void deletePrefix(String prefix) - Delete all files under a prefix.
org.apache.hadoop.conf.Configuration getConf()
void initialize(Map<String, String> props) - Initialize File IO from catalog properties.
Iterable<FileInfo> listPrefix(String prefix) - Return an iterable of all files under a prefix.
InputFile newInputFile(String path) - Get an InputFile instance to read bytes from the file at the given path.
InputFile newInputFile(String path, long length) - Get an InputFile instance to read bytes from the file at the given path, with a known file length.
OutputFile newOutputFile(String path) - Get an OutputFile instance to write bytes to the file at the given path.
Map<String, String> properties() - Returns the property map used to configure this FileIO.
void serializeConfWith(Function<org.apache.hadoop.conf.Configuration, SerializableSupplier<org.apache.hadoop.conf.Configuration>> confSerializer) - Take a function that serializes Hadoop configuration into a supplier.
void setConf(org.apache.hadoop.conf.Configuration conf)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.iceberg.io.FileIO
close, deleteFile, deleteFile, newInputFile, newInputFile, newInputFile
-
Constructor Details
-
HadoopFileIO
public HadoopFileIO()
Constructor used for dynamic FileIO loading. Hadoop configuration must be set through setConf(Configuration).
-
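The no-argument constructor exists so that the class can be instantiated by name and configured afterwards. The block below is a minimal JDK-only sketch of that pattern, not Iceberg's actual loader: `ConfigurableIO`, `SketchFileIO`, and `load` are hypothetical names, and a `Map<String, String>` stands in for the Hadoop `Configuration`.

```java
import java.util.Collections;
import java.util.Map;

final class DynamicLoadingSketch {

  // Hypothetical stand-in for an IO type that receives its configuration after construction.
  interface ConfigurableIO {
    void setConf(Map<String, String> conf);

    Map<String, String> getConf();
  }

  // Hypothetical implementation exposing the no-arg constructor that reflective loading needs.
  static final class SketchFileIO implements ConfigurableIO {
    private Map<String, String> conf; // stays null until setConf is called

    public SketchFileIO() {}

    @Override
    public void setConf(Map<String, String> conf) {
      this.conf = conf;
    }

    @Override
    public Map<String, String> getConf() {
      return conf;
    }
  }

  // Instantiate the class by name, then inject the configuration through the setter.
  static ConfigurableIO load(String className, Map<String, String> conf) {
    try {
      ConfigurableIO io =
          (ConfigurableIO) Class.forName(className).getDeclaredConstructor().newInstance();
      io.setConf(conf);
      return io;
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot load " + className, e);
    }
  }

  public static void main(String[] args) {
    Map<String, String> conf = Collections.singletonMap("fs.defaultFS", "file:///");
    ConfigurableIO io = load(SketchFileIO.class.getName(), conf);
    System.out.println(io.getConf().get("fs.defaultFS"));
  }
}
```

The point of the sketch is the ordering: an instance created this way has no configuration until the setter runs, which is why the constructor description requires a call to setConf(Configuration).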
HadoopFileIO
public HadoopFileIO(org.apache.hadoop.conf.Configuration hadoopConf)
-
HadoopFileIO
public HadoopFileIO(SerializableSupplier<org.apache.hadoop.conf.Configuration> hadoopConf)
-
-
Method Details
-
conf
public org.apache.hadoop.conf.Configuration conf()
-
initialize
public void initialize(Map<String, String> props)
Description copied from interface: FileIO
Initialize File IO from catalog properties.
- Specified by:
initialize in interface FileIO
- Parameters:
props - catalog properties
-
newInputFile
public InputFile newInputFile(String path)
Description copied from interface: FileIO
Get an InputFile instance to read bytes from the file at the given path.
- Specified by:
newInputFile in interface FileIO
-
newInputFile
public InputFile newInputFile(String path, long length)
Description copied from interface: FileIO
Get an InputFile instance to read bytes from the file at the given path, with a known file length.
- Specified by:
newInputFile in interface FileIO
-
newOutputFile
public OutputFile newOutputFile(String path)
Description copied from interface: FileIO
Get an OutputFile instance to write bytes to the file at the given path.
- Specified by:
newOutputFile in interface FileIO
-
deleteFile
public void deleteFile(String path)
Description copied from interface: FileIO
Delete the file at the given path.
- Specified by:
deleteFile in interface FileIO
-
properties
public Map<String, String> properties()
Description copied from interface: FileIO
Returns the property map used to configure this FileIO.
- Specified by:
properties in interface FileIO
-
setConf
public void setConf(org.apache.hadoop.conf.Configuration conf)
- Specified by:
setConf in interface org.apache.hadoop.conf.Configurable
-
getConf
public org.apache.hadoop.conf.Configuration getConf()
- Specified by:
getConf in interface org.apache.hadoop.conf.Configurable
-
serializeConfWith
public void serializeConfWith(Function<org.apache.hadoop.conf.Configuration, SerializableSupplier<org.apache.hadoop.conf.Configuration>> confSerializer)
Description copied from interface: HadoopConfigurable
Take a function that serializes Hadoop configuration into a supplier. An implementation is expected to pass its current Hadoop configuration to this function; the resulting supplier can be safely serialized for later use.
- Specified by:
serializeConfWith in interface HadoopConfigurable
- Parameters:
confSerializer - A function that takes a Hadoop configuration and returns a serializable supplier of it.
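The contract above can be illustrated with a JDK-only sketch. It is not the Iceberg implementation: `Conf` is a hypothetical stand-in for the non-serializable Hadoop `Configuration`, the nested `SerializableSupplier` interface is redeclared locally for the example, and `SketchIO`, `SnapshotSupplier`, and `roundTrip` are illustrative names. The holder passes its current configuration to the supplied function and keeps the returned supplier, after which Java serialization succeeds.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

final class SerializeConfSketch {

  // Hypothetical stand-in for Hadoop's Configuration; deliberately not Serializable.
  static final class Conf {
    final Map<String, String> values = new HashMap<>();
  }

  // Local declaration for this sketch: a Supplier that is also Serializable.
  interface SerializableSupplier<T> extends Supplier<T>, Serializable {}

  // Serializable supplier that snapshots the entries and rebuilds a Conf on demand.
  static final class SnapshotSupplier implements SerializableSupplier<Conf> {
    private static final long serialVersionUID = 1L;
    private final HashMap<String, String> snapshot;

    SnapshotSupplier(Conf conf) {
      this.snapshot = new HashMap<>(conf.values);
    }

    @Override
    public Conf get() {
      Conf rebuilt = new Conf();
      rebuilt.values.putAll(snapshot);
      return rebuilt;
    }
  }

  // Hypothetical IO object that keeps its configuration behind a supplier.
  static final class SketchIO implements Serializable {
    private static final long serialVersionUID = 1L;
    private SerializableSupplier<Conf> conf;

    SketchIO(Conf conf) {
      // This lambda captures a non-serializable Conf, so it cannot be written out as-is.
      this.conf = () -> conf;
    }

    // Pass the current configuration to the function and keep the supplier it returns.
    void serializeConfWith(Function<Conf, SerializableSupplier<Conf>> confSerializer) {
      this.conf = confSerializer.apply(conf.get());
    }

    Conf conf() {
      return conf.get();
    }
  }

  // Serialize and deserialize the IO object in memory.
  static SketchIO roundTrip(SketchIO io) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(io);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (SketchIO) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("Round trip failed", e);
    }
  }

  public static void main(String[] args) {
    Conf conf = new Conf();
    conf.values.put("fs.defaultFS", "file:///");
    SketchIO io = new SketchIO(conf);
    io.serializeConfWith(SnapshotSupplier::new);
    System.out.println(roundTrip(io).conf().values.get("fs.defaultFS"));
  }
}
```

In this sketch the caller decides how the configuration is captured, which is the design the method description suggests: the IO object does not need to know the serialization mechanism of the engine it runs in.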
-
listPrefix
public Iterable<FileInfo> listPrefix(String prefix)
Description copied from interface: SupportsPrefixOperations
Return an iterable of all files under a prefix.
Hierarchical file systems (e.g. HDFS) may impose additional restrictions, such as requiring the prefix to fully match a directory, whereas key/value object stores may allow arbitrary prefixes.
- Specified by:
listPrefix in interface SupportsPrefixOperations
- Parameters:
prefix - prefix to list
- Returns:
iterable of file information
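The difference between directory-bound and arbitrary prefixes can be seen in a small JDK-only sketch. It does not reproduce how HadoopFileIO lists files; `listUnderDirectory`, `listUnderKeyPrefix`, and `createSampleTree` are hypothetical helpers, with `java.nio.file` standing in for a hierarchical file system and a sorted set of strings standing in for an object store's keys.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class PrefixListingSketch {

  // Hierarchical style: the prefix has to name an existing directory.
  static List<String> listUnderDirectory(Path dir) {
    if (!Files.isDirectory(dir)) {
      throw new IllegalArgumentException("Not a directory: " + dir);
    }
    try (Stream<Path> walk = Files.walk(dir)) {
      return walk.filter(Files::isRegularFile)
          .map(p -> dir.relativize(p).toString().replace('\\', '/'))
          .sorted()
          .collect(Collectors.toList());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Key/value style: any string prefix matches, even one that ends in the middle of a name.
  static List<String> listUnderKeyPrefix(Collection<String> keys, String prefix) {
    return keys.stream()
        .filter(key -> key.startsWith(prefix))
        .sorted()
        .collect(Collectors.toList());
  }

  // Builds a small temporary tree: data/a.parquet, data/b.parquet, metadata/v1.json.
  static Path createSampleTree() {
    try {
      Path root = Files.createTempDirectory("prefix-sketch");
      Files.createDirectories(root.resolve("data"));
      Files.createDirectories(root.resolve("metadata"));
      Files.write(root.resolve("data/a.parquet"), new byte[0]);
      Files.write(root.resolve("data/b.parquet"), new byte[0]);
      Files.write(root.resolve("metadata/v1.json"), new byte[0]);
      return root;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path root = createSampleTree();
    // Directory prefix: lists the two files under data/.
    System.out.println(listUnderDirectory(root.resolve("data")));

    // Arbitrary key prefix: "data" also matches the unrelated key "database/x".
    TreeSet<String> keys =
        new TreeSet<>(Arrays.asList("data/a.parquet", "data/b.parquet", "database/x"));
    System.out.println(listUnderKeyPrefix(keys, "data"));
  }
}
```

With the directory-based helper, a partial name such as `dat` is rejected because no such directory exists, while the key-based helper accepts it and returns every key that starts with those characters.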
-
deletePrefix
public void deletePrefix(String prefix)
Description copied from interface: SupportsPrefixOperations
Delete all files under a prefix.
Hierarchical file systems (e.g. HDFS) may impose additional restrictions, such as requiring the prefix to fully match a directory, whereas key/value object stores may allow arbitrary prefixes.
- Specified by:
deletePrefix in interface SupportsPrefixOperations
- Parameters:
prefix - prefix to delete
-
deleteFiles
public void deleteFiles(Iterable<String> pathsToDelete)
Description copied from interface: SupportsBulkOperations
Delete the files at the given paths.
- Specified by:
deleteFiles in interface SupportsBulkOperations
- Parameters:
pathsToDelete - The paths to delete
- Throws:
BulkDeletionFailureException - if at least one file could not be deleted
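The bulk-delete contract (attempt every path, then report failures through a single exception) can be sketched with the JDK alone. This is an illustration under stated assumptions, not HadoopFileIO's implementation: `BulkFailure`, `deleteAll`, and `newTempFile` are hypothetical names, deletion is sequential, and a missing file is counted as a failure purely as a choice for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

final class BulkDeleteSketch {

  // Hypothetical stand-in for a bulk-failure exception that carries the failure count.
  static final class BulkFailure extends RuntimeException {
    private static final long serialVersionUID = 1L;
    private final int failed;

    BulkFailure(int failed) {
      super("Failed to delete " + failed + " file(s)");
      this.failed = failed;
    }

    int failedCount() {
      return failed;
    }
  }

  // Try every path; a failure on one path does not stop the remaining deletions.
  static void deleteAll(Iterable<String> pathsToDelete) {
    int failed = 0;
    for (String path : pathsToDelete) {
      try {
        Files.delete(Paths.get(path)); // throws if the file is missing or cannot be removed
      } catch (IOException | RuntimeException e) {
        failed++;
      }
    }
    if (failed > 0) {
      throw new BulkFailure(failed);
    }
  }

  // Creates an empty temporary file and returns its path as a string.
  static String newTempFile() {
    try {
      return Files.createTempFile("bulk-delete-sketch", ".tmp").toString();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    String existing = newTempFile();
    try {
      deleteAll(Arrays.asList(existing, existing + ".missing"));
    } catch (BulkFailure e) {
      // The existing file is gone; only the missing path is reported as failed.
      System.out.println(e.getMessage());
    }
  }
}
```

Callers of a method with this shape generally need to catch the aggregate exception rather than expect a per-file error, since successful deletions are not rolled back when another path fails.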
-