Package org.apache.iceberg.spark
Class SparkCatalog
java.lang.Object
org.apache.iceberg.spark.SparkCatalog
All Implemented Interfaces:
HasIcebergCatalog, SupportsReplaceView, org.apache.spark.sql.connector.catalog.CatalogPlugin, org.apache.spark.sql.connector.catalog.FunctionCatalog, org.apache.spark.sql.connector.catalog.StagingTableCatalog, org.apache.spark.sql.connector.catalog.SupportsNamespaces, org.apache.spark.sql.connector.catalog.TableCatalog, org.apache.spark.sql.connector.catalog.ViewCatalog, ProcedureCatalog
A Spark TableCatalog implementation that wraps an Iceberg Catalog.
This supports the following catalog configuration options:
- type - catalog type, "hive", "hadoop", or "rest". To specify a catalog other than a Hive or Hadoop catalog, use the catalog-impl option.
- uri - the Hive Metastore URI for a Hive catalog, or the REST URI for a REST catalog
- warehouse - the warehouse path (Hadoop catalog only)
- catalog-impl - a custom Catalog implementation to use
- io-impl - a custom FileIO implementation to use
- metrics-reporter-impl - a custom MetricsReporter implementation to use
- default-namespace - a namespace to use as the default
- cache-enabled - whether to enable catalog caching
- cache.case-sensitive - whether the catalog cache should compare table identifiers in a case-sensitive way
- cache.expiration-interval-ms - interval in millis before expiring tables from the catalog cache. Refer to CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS for further details and significant values.
- table-default.$tablePropertyKey - default value at the catalog level for table property $tablePropertyKey
- table-override.$tablePropertyKey - value for table property $tablePropertyKey enforced at the catalog level
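As a sketch of how these options are typically wired up, a catalog could be registered through Spark configuration properties like the following. The catalog name "my_catalog", the warehouse path, and the namespace are illustrative, not part of this API:

```properties
# Register a SparkCatalog instance under the (hypothetical) name "my_catalog"
spark.sql.catalog.my_catalog = org.apache.iceberg.spark.SparkCatalog
# Catalog type: "hive", "hadoop", or "rest"
spark.sql.catalog.my_catalog.type = hadoop
# Warehouse path (Hadoop catalog only); this path is illustrative
spark.sql.catalog.my_catalog.warehouse = hdfs://namenode:8020/warehouse
# Optional: default namespace and cache tuning
spark.sql.catalog.my_catalog.default-namespace = db
spark.sql.catalog.my_catalog.cache-enabled = true
spark.sql.catalog.my_catalog.cache.expiration-interval-ms = 30000
```

Each option above is passed to initialize(String, CaseInsensitiveStringMap) with the "spark.sql.catalog.my_catalog." prefix stripped.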
-
Field Summary
Fields inherited from interface org.apache.spark.sql.connector.catalog.SupportsNamespaces
PROP_COMMENT, PROP_LOCATION, PROP_OWNER
Fields inherited from interface org.apache.spark.sql.connector.catalog.TableCatalog
OPTION_PREFIX, PROP_COMMENT, PROP_EXTERNAL, PROP_IS_MANAGED_LOCATION, PROP_LOCATION, PROP_OWNER, PROP_PROVIDER
Fields inherited from interface org.apache.spark.sql.connector.catalog.ViewCatalog
PROP_COMMENT, PROP_CREATE_ENGINE_VERSION, PROP_ENGINE_VERSION, PROP_OWNER, RESERVED_PROPERTIES
-
Constructor Summary
Constructors
SparkCatalog()
-
Method Summary
void alterNamespace(String[] namespace, org.apache.spark.sql.connector.catalog.NamespaceChange... changes)
org.apache.spark.sql.connector.catalog.Table alterTable(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.connector.catalog.TableChange... changes)
org.apache.spark.sql.connector.catalog.View alterView(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.connector.catalog.ViewChange... changes)
protected Catalog buildIcebergCatalog(String name, org.apache.spark.sql.util.CaseInsensitiveStringMap options)
    Build an Iceberg Catalog to be used by this Spark catalog adapter.
protected TableIdentifier buildIdentifier(org.apache.spark.sql.connector.catalog.Identifier identifier)
    Build an Iceberg TableIdentifier for the given Spark identifier.
void createNamespace(String[] namespace, Map<String, String> metadata)
org.apache.spark.sql.connector.catalog.Table createTable(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String, String> properties)
org.apache.spark.sql.connector.catalog.View createView(org.apache.spark.sql.connector.catalog.Identifier ident, String sql, String currentCatalog, String[] currentNamespace, org.apache.spark.sql.types.StructType schema, String[] queryColumnNames, String[] columnAliases, String[] columnComments, Map<String, String> properties)
String[] defaultNamespace()
boolean dropNamespace(String[] namespace, boolean cascade)
boolean dropTable(org.apache.spark.sql.connector.catalog.Identifier ident)
boolean dropView(org.apache.spark.sql.connector.catalog.Identifier ident)
Catalog icebergCatalog()
    Returns the underlying Catalog backing this Spark Catalog.
final void initialize(String name, org.apache.spark.sql.util.CaseInsensitiveStringMap options)
void invalidateTable(org.apache.spark.sql.connector.catalog.Identifier ident)
boolean isExistingNamespace(String[] namespace)
boolean isFunctionNamespace(String[] namespace)
default org.apache.spark.sql.connector.catalog.Identifier[] listFunctions(String[] namespace)
String[][] listNamespaces()
String[][] listNamespaces(String[] namespace)
org.apache.spark.sql.connector.catalog.Identifier[] listTables(String[] namespace)
org.apache.spark.sql.connector.catalog.Identifier[] listViews(String... namespace)
default org.apache.spark.sql.connector.catalog.functions.UnboundFunction loadFunction(org.apache.spark.sql.connector.catalog.Identifier ident)
Map<String,String> loadNamespaceMetadata(String[] namespace)
Procedure loadProcedure(org.apache.spark.sql.connector.catalog.Identifier ident)
    Load a stored procedure by identifier.
org.apache.spark.sql.connector.catalog.Table loadTable(org.apache.spark.sql.connector.catalog.Identifier ident)
org.apache.spark.sql.connector.catalog.Table loadTable(org.apache.spark.sql.connector.catalog.Identifier ident, long timestamp)
org.apache.spark.sql.connector.catalog.Table loadTable(org.apache.spark.sql.connector.catalog.Identifier ident, String version)
org.apache.spark.sql.connector.catalog.View loadView(org.apache.spark.sql.connector.catalog.Identifier ident)
String name()
boolean purgeTable(org.apache.spark.sql.connector.catalog.Identifier ident)
void renameTable(org.apache.spark.sql.connector.catalog.Identifier from, org.apache.spark.sql.connector.catalog.Identifier to)
void renameView(org.apache.spark.sql.connector.catalog.Identifier fromIdentifier, org.apache.spark.sql.connector.catalog.Identifier toIdentifier)
org.apache.spark.sql.connector.catalog.View replaceView(org.apache.spark.sql.connector.catalog.Identifier ident, String sql, String currentCatalog, String[] currentNamespace, org.apache.spark.sql.types.StructType schema, String[] queryColumnNames, String[] columnAliases, String[] columnComments, Map<String, String> properties)
    Replace a view in the catalog.
org.apache.spark.sql.connector.catalog.StagedTable stageCreate(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String, String> properties)
org.apache.spark.sql.connector.catalog.StagedTable stageCreateOrReplace(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String, String> properties)
org.apache.spark.sql.connector.catalog.StagedTable stageReplace(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String, String> properties)
boolean useNullableQuerySchema()
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.spark.sql.connector.catalog.FunctionCatalog
functionExists
Methods inherited from interface org.apache.spark.sql.connector.catalog.StagingTableCatalog
stageCreate, stageCreateOrReplace, stageReplace
Methods inherited from interface org.apache.spark.sql.connector.catalog.SupportsNamespaces
namespaceExists
Methods inherited from interface org.apache.spark.sql.connector.catalog.TableCatalog
capabilities, createTable, loadTable, tableExists
Methods inherited from interface org.apache.spark.sql.connector.catalog.ViewCatalog
invalidateView, viewExists
-
Constructor Details
-
SparkCatalog
public SparkCatalog()
-
-
Method Details
-
buildIcebergCatalog
protected Catalog buildIcebergCatalog(String name, org.apache.spark.sql.util.CaseInsensitiveStringMap options)
Build an Iceberg Catalog to be used by this Spark catalog adapter.
Parameters:
name - Spark's catalog name
options - Spark's catalog options
Returns:
an Iceberg catalog
-
buildIdentifier
protected TableIdentifier buildIdentifier(org.apache.spark.sql.connector.catalog.Identifier identifier)
Build an Iceberg TableIdentifier for the given Spark identifier.
Parameters:
identifier - Spark's identifier
Returns:
an Iceberg identifier
-
loadTable
public org.apache.spark.sql.connector.catalog.Table loadTable(org.apache.spark.sql.connector.catalog.Identifier ident) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException
-
loadTable
public org.apache.spark.sql.connector.catalog.Table loadTable(org.apache.spark.sql.connector.catalog.Identifier ident, String version) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException
-
loadTable
public org.apache.spark.sql.connector.catalog.Table loadTable(org.apache.spark.sql.connector.catalog.Identifier ident, long timestamp) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException
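The version and timestamp overloads above back Spark's time-travel reads. Assuming a table registered under a hypothetical catalog named "my_catalog" (the catalog, table, snapshot id, and timestamp below are illustrative), they correspond to SQL such as:

```sql
-- Load a table at a specific snapshot (the String version overload)
SELECT * FROM my_catalog.db.events VERSION AS OF 10963874102873;

-- Load a table as of a point in time (the long timestamp overload)
SELECT * FROM my_catalog.db.events TIMESTAMP AS OF '2024-01-01 00:00:00';
```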
-
createTable
public org.apache.spark.sql.connector.catalog.Table createTable(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String, String> properties) throws org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
Throws:
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
-
stageCreate
public org.apache.spark.sql.connector.catalog.StagedTable stageCreate(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String, String> properties) throws org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
Throws:
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
-
stageReplace
public org.apache.spark.sql.connector.catalog.StagedTable stageReplace(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.types.StructType schema, org.apache.spark.sql.connector.expressions.Transform[] transforms, Map<String, String> properties) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException
-
stageCreateOrReplace
-
alterTable
public org.apache.spark.sql.connector.catalog.Table alterTable(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.connector.catalog.TableChange... changes) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException
-
dropTable
public boolean dropTable(org.apache.spark.sql.connector.catalog.Identifier ident)
-
purgeTable
public boolean purgeTable(org.apache.spark.sql.connector.catalog.Identifier ident)
-
renameTable
public void renameTable(org.apache.spark.sql.connector.catalog.Identifier from, org.apache.spark.sql.connector.catalog.Identifier to) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException, org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
-
invalidateTable
public void invalidateTable(org.apache.spark.sql.connector.catalog.Identifier ident)
-
listTables
-
defaultNamespace
-
listNamespaces
-
listNamespaces
public String[][] listNamespaces(String[] namespace) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
-
loadNamespaceMetadata
public Map<String,String> loadNamespaceMetadata(String[] namespace) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
-
createNamespace
public void createNamespace(String[] namespace, Map<String, String> metadata) throws org.apache.spark.sql.catalyst.analysis.NamespaceAlreadyExistsException
Throws:
org.apache.spark.sql.catalyst.analysis.NamespaceAlreadyExistsException
-
alterNamespace
public void alterNamespace(String[] namespace, org.apache.spark.sql.connector.catalog.NamespaceChange... changes) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
-
dropNamespace
public boolean dropNamespace(String[] namespace, boolean cascade) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
-
listViews
-
loadView
public org.apache.spark.sql.connector.catalog.View loadView(org.apache.spark.sql.connector.catalog.Identifier ident) throws org.apache.spark.sql.catalyst.analysis.NoSuchViewException
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchViewException
-
createView
public org.apache.spark.sql.connector.catalog.View createView(org.apache.spark.sql.connector.catalog.Identifier ident, String sql, String currentCatalog, String[] currentNamespace, org.apache.spark.sql.types.StructType schema, String[] queryColumnNames, String[] columnAliases, String[] columnComments, Map<String, String> properties) throws org.apache.spark.sql.catalyst.analysis.ViewAlreadyExistsException, org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
Throws:
org.apache.spark.sql.catalyst.analysis.ViewAlreadyExistsException
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
-
replaceView
public org.apache.spark.sql.connector.catalog.View replaceView(org.apache.spark.sql.connector.catalog.Identifier ident, String sql, String currentCatalog, String[] currentNamespace, org.apache.spark.sql.types.StructType schema, String[] queryColumnNames, String[] columnAliases, String[] columnComments, Map<String, String> properties) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException, org.apache.spark.sql.catalyst.analysis.NoSuchViewException
Description copied from interface: SupportsReplaceView
Replace a view in the catalog.
Parameters:
ident - a view identifier
sql - the SQL text that defines the view
currentCatalog - the current catalog
currentNamespace - the current namespace
schema - the view query output schema
queryColumnNames - the query column names
columnAliases - the column aliases
columnComments - the column comments
properties - the view properties
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException - If the identifier namespace does not exist (optional)
org.apache.spark.sql.catalyst.analysis.NoSuchViewException - If the view doesn't exist or is a table
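In Spark SQL, this method is what ultimately serves a replace-view statement. The catalog, namespace, and view names below are illustrative, not part of this API:

```sql
-- Replaces the view if it already exists under the hypothetical catalog "my_catalog"
CREATE OR REPLACE VIEW my_catalog.db.event_counts AS
SELECT event_type, COUNT(1) AS cnt
FROM my_catalog.db.events
GROUP BY event_type;
```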
-
alterView
public org.apache.spark.sql.connector.catalog.View alterView(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.connector.catalog.ViewChange... changes) throws org.apache.spark.sql.catalyst.analysis.NoSuchViewException, IllegalArgumentException
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchViewException
IllegalArgumentException
-
dropView
public boolean dropView(org.apache.spark.sql.connector.catalog.Identifier ident) -
renameView
public void renameView(org.apache.spark.sql.connector.catalog.Identifier fromIdentifier, org.apache.spark.sql.connector.catalog.Identifier toIdentifier) throws org.apache.spark.sql.catalyst.analysis.NoSuchViewException, org.apache.spark.sql.catalyst.analysis.ViewAlreadyExistsException
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchViewException
org.apache.spark.sql.catalyst.analysis.ViewAlreadyExistsException
-
initialize
public final void initialize(String name, org.apache.spark.sql.util.CaseInsensitiveStringMap options)
Specified by:
initialize in interface org.apache.spark.sql.connector.catalog.CatalogPlugin
-
name
-
icebergCatalog
public Catalog icebergCatalog()
Description copied from interface: HasIcebergCatalog
Returns the underlying Catalog backing this Spark Catalog.
-
loadProcedure
public Procedure loadProcedure(org.apache.spark.sql.connector.catalog.Identifier ident) throws NoSuchProcedureException
Description copied from interface: ProcedureCatalog
Load a stored procedure by identifier.
Specified by:
loadProcedure in interface ProcedureCatalog
Parameters:
ident - a stored procedure identifier
Returns:
the stored procedure's metadata
Throws:
NoSuchProcedureException - if there is no matching stored procedure
-
isFunctionNamespace
-
isExistingNamespace
-
useNullableQuerySchema
public boolean useNullableQuerySchema()
Specified by:
useNullableQuerySchema in interface org.apache.spark.sql.connector.catalog.TableCatalog
-
listFunctions
default org.apache.spark.sql.connector.catalog.Identifier[] listFunctions(String[] namespace) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
Specified by:
listFunctions in interface org.apache.spark.sql.connector.catalog.FunctionCatalog
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
-
loadFunction
default org.apache.spark.sql.connector.catalog.functions.UnboundFunction loadFunction(org.apache.spark.sql.connector.catalog.Identifier ident) throws org.apache.spark.sql.catalyst.analysis.NoSuchFunctionException
Specified by:
loadFunction in interface org.apache.spark.sql.connector.catalog.FunctionCatalog
Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchFunctionException
-