Catalog properties

Common properties

Iceberg catalogs are configured through catalog properties. The most commonly used properties are:

Property	Default	Description
catalog-impl	null	a custom `Catalog` implementation for an engine to use
io-impl	null	a custom `FileIO` implementation to use in a catalog
warehouse	null	the root path of the data warehouse
uri	null	a URI string, such as Hive metastore URI
clients	2	client pool size
cache-enabled	true	Whether to cache catalog entries
cache.expiration-interval-ms	30000	How long catalog entries are locally cached, in milliseconds; 0 disables caching, negative values disable expiration
metrics-reporter-impl	org.apache.iceberg.metrics.LoggingMetricsReporter	Custom `MetricsReporter` implementation to use in a catalog. See the Metrics reporting section for additional details
unique-table-location	false	Whether to use a unique location for new tables
encryption.kms-impl	null	a custom `KeyManagementClient` implementation to use in a catalog for interactions with KMS (key management service). See the Encryption document for additional details

HadoopCatalog and HiveCatalog can access the properties in their constructors. Any other custom catalog can access the properties by implementing Catalog.initialize(catalogName, catalogProperties). The properties can be manually constructed or passed in from a compute engine like Spark or Flink. Spark uses its session properties as catalog properties; see the Spark configuration section for details. Flink passes in catalog properties through the CREATE CATALOG statement; see the Flink section for details.
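As a rough illustration of how manually constructed properties can reach an engine, the sketch below maps a property dictionary to Spark session properties. It assumes the `spark.sql.catalog.<catalog-name>.<property>` naming convention described in the Spark configuration section. The catalog name `my_catalog`, the metastore URI, and the warehouse path are hypothetical, and `to_spark_conf` is an illustrative helper, not part of Iceberg or Spark.

```python
def to_spark_conf(catalog_name, catalog_props,
                  catalog_class="org.apache.iceberg.spark.SparkCatalog"):
    """Prefix each catalog property so it can be set as a Spark session property."""
    prefix = f"spark.sql.catalog.{catalog_name}"
    conf = {prefix: catalog_class}
    for key, value in catalog_props.items():
        # Spark session properties are strings, so values are stringified.
        conf[f"{prefix}.{key}"] = str(value)
    return conf


# Hypothetical values; only the property names come from the table above.
catalog_props = {
    "uri": "thrift://metastore-host:9083",
    "warehouse": "s3://example-bucket/warehouse",
    "clients": 4,
    "cache-enabled": "false",
}

spark_conf = to_spark_conf("my_catalog", catalog_props)
for key, value in sorted(spark_conf.items()):
    print(f"{key}={value}")
```

Each resulting key/value pair could then be passed to Spark, for example with `--conf key=value`.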

REST catalog properties

The following properties configure the behavior of the REST catalog client.

Property	Default	Description
`snapshot-loading-mode`	`ALL`	Controls how snapshots are loaded from the REST server. Supported values: `ALL` (load all snapshots), `REFS` (load only referenced snapshots).
`rest-metrics-reporting-enabled`	`true`	Whether to enable metrics reporting to the REST server.
`view-endpoints-supported`	`false`	For backwards compatibility with older REST servers. Set to `true` if the server supports view endpoints but doesn't send the `endpoints` field in the ConfigResponse.
`rest-page-size`	null	The page size to use when listing namespaces, tables, or other paginated resources.
`namespace-separator`	`%1F`	The separator character used for namespace levels when communicating with the REST server.
`scan-planning-mode`	`CLIENT`	Controls where scan planning is performed. Supported values: `CLIENT` (client-side planning), `SERVER` (server-side planning). Can be overridden per-table by the server in LoadTableResponse.
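To make the default `namespace-separator` concrete: `%1F` is the URL-encoded form of the ASCII unit separator character (0x1F). The sketch below shows one way a multi-level namespace could be joined for use in a request path. The helper names and the example namespace are hypothetical, and the real client's handling of special characters inside a level may differ.

```python
from urllib.parse import quote, unquote

UNIT_SEPARATOR = "\x1f"  # URL-encodes to %1F


def encode_namespace(levels, separator=UNIT_SEPARATOR):
    """Join namespace levels with the URL-encoded separator."""
    encoded_separator = quote(separator, safe="")
    return encoded_separator.join(quote(level, safe="") for level in levels)


def decode_namespace(encoded, separator=UNIT_SEPARATOR):
    """Split an encoded namespace back into its levels."""
    encoded_separator = quote(separator, safe="")
    return [unquote(part) for part in encoded.split(encoded_separator)]


# Hypothetical two-level namespace.
encoded = encode_namespace(["accounting", "tax"])
print(encoded)
print(decode_namespace(encoded))
```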

Table cache properties

The following properties configure the table cache used for freshness-aware table loading. Note that this cache is separate from the general cache that can be configured at the catalog level.

Property	Default	Description
`rest-table-cache.expire-after-write-ms`	`300000` (5 min)	Time in milliseconds after which cached table entries expire.
`rest-table-cache.max-entries`	`100`	Maximum number of table entries to cache.
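The two settings above describe a cache that is bounded both in time and in size. The following sketch illustrates those semantics only; it is not the REST client's implementation. The class name `ExpiringTableCache` is hypothetical, and evicting the oldest-written entry when the size limit is exceeded is an assumption, since the table does not specify an eviction policy.

```python
import time
from collections import OrderedDict


class ExpiringTableCache:
    """Toy cache with expire-after-write and a maximum entry count."""

    def __init__(self, expire_after_write_ms=300_000, max_entries=100,
                 clock=time.monotonic):
        self._expire_ms = expire_after_write_ms
        self._max_entries = max_entries
        self._clock = clock  # returns seconds; injectable for testing
        self._entries = OrderedDict()  # key -> (value, write time in seconds)

    def put(self, key, value):
        # Re-inserting moves the key to the end, so order tracks write time.
        self._entries.pop(key, None)
        self._entries[key] = (value, self._clock())
        while len(self._entries) > self._max_entries:
            # Assumption: evict the entry that was written longest ago.
            self._entries.popitem(last=False)

    def get(self, key):
        item = self._entries.get(key)
        if item is None:
            return None
        value, written_at = item
        if (self._clock() - written_at) * 1000 >= self._expire_ms:
            # Entry is older than the expire-after-write interval.
            del self._entries[key]
            return None
        return value
```

With the defaults, an entry written at time t is served until t + 5 minutes, and at most 100 tables are held at once.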

Auth properties

The following catalog properties configure authentication for the REST catalog. They support Basic, OAuth2, SigV4, and Google authentication.

REST auth properties

Property	Default	Description
`rest.auth.type`	`none`	Authentication mechanism for REST catalog access. Supported values: `none`, `basic`, `oauth2`, `sigv4`, `google`.
`rest.auth.basic.username`	null	Username for Basic authentication. Required if `rest.auth.type` = `basic`.
`rest.auth.basic.password`	null	Password for Basic authentication. Required if `rest.auth.type` = `basic`.
`rest.auth.sigv4.delegate-auth-type`	`oauth2`	Auth type to delegate to after `sigv4` signing.

OAuth2 auth properties

Required and optional properties for `oauth2` authentication:

Property	Default	Description
`token`	null	A Bearer token to interact with the server. Either `token` or `credential` is required.
`credential`	null	Credential string in the form of `client_id:client_secret` to exchange for a token in the OAuth2 client credentials flow. Either `token` or `credential` is required.
`oauth2-server-uri`	`v1/oauth/tokens`	OAuth2 token endpoint URI. Required if the REST catalog is not the OAuth2 authentication server.
`token-expires-in-ms`	3600000 (1 hour)	Time in milliseconds after which a bearer token is considered expired. Used to decide when to refresh or re-exchange a token.
`token-refresh-enabled`	true	Determines whether tokens are automatically refreshed when expiration details are available.
`token-exchange-enabled`	true	Determines whether to use the token exchange flow to acquire new tokens. Disabling this will allow fallback to the client credential flow.
`scope`	`catalog`	Additional scope for `oauth2`.
`audience`	null	Optional parameter specifying the token `audience`.
`resource`	null	Optional parameter specifying the token `resource`.
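The requirement rules stated in the Basic and OAuth2 tables above can be checked before a catalog is created. The sketch below is a minimal pre-flight check built only from those stated rules; `validate_auth_properties` is a hypothetical helper, not an Iceberg API, and the credential value is a placeholder.

```python
SUPPORTED_AUTH_TYPES = {"none", "basic", "oauth2", "sigv4", "google"}


def validate_auth_properties(props):
    """Return a list of problems found in REST auth properties (empty if none)."""
    problems = []
    auth_type = props.get("rest.auth.type", "none")  # documented default
    if auth_type not in SUPPORTED_AUTH_TYPES:
        problems.append(f"unsupported rest.auth.type: {auth_type}")
    elif auth_type == "basic":
        for key in ("rest.auth.basic.username", "rest.auth.basic.password"):
            if not props.get(key):
                problems.append(f"{key} is required when rest.auth.type = basic")
    elif auth_type == "oauth2":
        if not props.get("token") and not props.get("credential"):
            problems.append(
                "either token or credential is required when rest.auth.type = oauth2"
            )
    return problems


# Placeholder credential in the documented client_id:client_secret form.
oauth2_props = {
    "rest.auth.type": "oauth2",
    "credential": "my-client-id:my-client-secret",
    "scope": "catalog",
}
print(validate_auth_properties(oauth2_props))
```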

Google auth properties

Required and optional properties for `google` authentication:

Property	Default	Description
`gcp.auth.credentials-path`	Application Default Credentials (ADC)	Path to a service account JSON key file.
`gcp.auth.credentials-json`	Application Default Credentials (ADC)	JSON string of a service account credential.
`gcp.auth.scopes`	`https://www.googleapis.com/auth/cloud-platform`	Comma-separated list of OAuth scopes to request.

Lock catalog properties

Here are the catalog properties related to locking. They are used by some catalog implementations to control the locking behavior during commits.

Property	Default	Description
lock-impl	null	a custom implementation of the lock manager; the actual interface depends on the catalog used
lock.table	null	an auxiliary table for locking, such as in AWS DynamoDB lock manager
lock.acquire-interval-ms	5000 (5 s)	the interval to wait between each attempt to acquire a lock
lock.acquire-timeout-ms	180000 (3 min)	the maximum time to try acquiring a lock
lock.heartbeat-interval-ms	3000 (3 s)	the interval to wait between each heartbeat after acquiring a lock
lock.heartbeat-timeout-ms	15000 (15 s)	the maximum time without a heartbeat to consider a lock expired
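To show how the four timing properties relate, here is a simplified sketch of an acquire loop and a heartbeat expiry check. It only illustrates the semantics described in the table and is not the code of any Iceberg lock manager; `acquire_lock`, `lock_expired`, and the `try_acquire` callback are hypothetical names.

```python
import time


def acquire_lock(try_acquire, acquire_interval_ms=5000, acquire_timeout_ms=180_000,
                 clock=time.monotonic, sleep=time.sleep):
    """Retry try_acquire() every acquire_interval_ms until acquire_timeout_ms elapses."""
    interval_s = acquire_interval_ms / 1000
    deadline = clock() + acquire_timeout_ms / 1000
    while True:
        if try_acquire():
            return True
        if clock() + interval_s > deadline:
            # The next attempt would fall past the timeout, so give up.
            return False
        sleep(interval_s)


def lock_expired(last_heartbeat_ms, now_ms, heartbeat_timeout_ms=15_000):
    """A lock counts as expired once no heartbeat arrived within the timeout."""
    return now_ms - last_heartbeat_ms > heartbeat_timeout_ms
```

With the defaults, a holder sends a heartbeat every 3 seconds, so a lock is considered expired only after five consecutive heartbeats are missed.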

Hadoop configuration

HadoopTables Lock Configuration

When using HadoopTables (tables without a catalog), the lock properties from the Lock catalog properties section can be configured by prefixing them with `iceberg.tables.hadoop.` (including the trailing dot). Configuring a lock manager this way ensures atomic commits on file systems like S3 that lack native write mutual exclusion.

Info

To use DynamoDB as a lock manager with HadoopTables, set iceberg.tables.hadoop.lock-impl to org.apache.iceberg.aws.dynamodb.DynamoDbLockManager and iceberg.tables.hadoop.lock.table to your DynamoDB table name. See DynamoDB Lock Manager for more details.
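The prefixing rule can be sketched as a small helper that turns lock catalog properties into Hadoop configuration keys. The function name `to_hadoop_tables_conf` and the DynamoDB table name `my_lock_table` are hypothetical; the prefix and the lock manager class name come from the text above.

```python
HADOOP_TABLES_PREFIX = "iceberg.tables.hadoop."


def to_hadoop_tables_conf(lock_props):
    """Prefix lock catalog properties for use in a Hadoop configuration."""
    return {HADOOP_TABLES_PREFIX + key: str(value) for key, value in lock_props.items()}


hadoop_conf = to_hadoop_tables_conf({
    "lock-impl": "org.apache.iceberg.aws.dynamodb.DynamoDbLockManager",
    "lock.table": "my_lock_table",  # hypothetical DynamoDB table name
    "lock.acquire-timeout-ms": 180_000,
})
for key, value in sorted(hadoop_conf.items()):
    print(f"{key}={value}")
```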

Hive Metastore Configuration

The following properties from the Hadoop configuration are used by the Hive Metastore connector. HMS table locking is a two-step process:

Lock Creation: Create the lock in HMS and queue it for acquisition
Lock Check: Check whether the lock was successfully acquired

Property	Default	Description
iceberg.hive.client-pool-size	5	The size of the Hive client pool when tracking tables in HMS
iceberg.hive.lock-creation-timeout-ms	180000 (3 min)	Maximum time in milliseconds to create a lock in the HMS
iceberg.hive.lock-creation-min-wait-ms	50	Minimum time in milliseconds between retries of creating the lock in the HMS
iceberg.hive.lock-creation-max-wait-ms	5000	Maximum time in milliseconds between retries of creating the lock in the HMS
iceberg.hive.lock-timeout-ms	180000 (3 min)	Maximum time in milliseconds to acquire a lock
iceberg.hive.lock-check-min-wait-ms	50	Minimum time in milliseconds between checking the acquisition of the lock
iceberg.hive.lock-check-max-wait-ms	5000	Maximum time in milliseconds between checking the acquisition of the lock
iceberg.hive.lock-heartbeat-interval-ms	240000 (4 min)	The heartbeat interval for the HMS locks.
iceberg.hive.metadata-refresh-max-retries	2	Maximum number of retries when the metadata file is missing
iceberg.hive.table-level-lock-evict-ms	600000 (10 min)	The timeout for the JVM table lock
iceberg.engine.hive.lock-enabled	true	Use HMS locks to ensure atomicity of commits
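The min-wait and max-wait pairs in the table bound the delay between retries. One common way to move between such bounds is exponential backoff; the sketch below assumes that strategy and a growth factor of 2, neither of which is stated in the table. `backoff_waits` is a hypothetical helper used only to visualize the resulting schedule.

```python
def backoff_waits(min_wait_ms=50, max_wait_ms=5000, timeout_ms=180_000, factor=2.0):
    """List the waits between retries until the total wait would exceed timeout_ms."""
    waits = []
    elapsed = 0
    wait = min_wait_ms
    while elapsed + wait <= timeout_ms:
        waits.append(wait)
        elapsed += wait
        # Assumption: the wait grows geometrically and is capped at max_wait_ms.
        wait = min(int(wait * factor), max_wait_ms)
    return waits


schedule = backoff_waits()
print(schedule[:10])
```

Under these assumptions the wait starts at the minimum, reaches the maximum after a few retries, and stays there until the timeout is exhausted.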

Note: iceberg.hive.lock-check-max-wait-ms and iceberg.hive.lock-heartbeat-interval-ms should be less than the transaction timeout of the Hive Metastore (hive.txn.timeout, or metastore.txn.timeout in newer versions). Otherwise, the heartbeats on the lock (which happen during the lock checks) would expire in the Hive Metastore before Iceberg retries the lock.
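The constraint in this note can be verified with a simple comparison before deploying a configuration. The sketch below is a hypothetical check, not an Iceberg utility: `check_hms_lock_timing` is an illustrative name, and the caller is assumed to supply the metastore transaction timeout already converted to milliseconds.

```python
# Defaults taken from the table above.
HMS_TIMING_DEFAULTS = {
    "iceberg.hive.lock-check-max-wait-ms": 5000,
    "iceberg.hive.lock-heartbeat-interval-ms": 240_000,
}


def check_hms_lock_timing(conf, txn_timeout_ms):
    """Return the properties that are not below the metastore transaction timeout."""
    violations = []
    for key, default in HMS_TIMING_DEFAULTS.items():
        value = int(conf.get(key, default))
        if value >= txn_timeout_ms:
            violations.append(key)
    return violations


# Hypothetical scenario: heartbeat interval raised to 10 minutes while the
# metastore transaction timeout is 5 minutes.
risky_conf = {"iceberg.hive.lock-heartbeat-interval-ms": "600000"}
print(check_hms_lock_timing(risky_conf, txn_timeout_ms=300_000))
```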

Warning: Setting iceberg.engine.hive.lock-enabled=false will cause HiveCatalog to commit to tables without using Hive locks. This should only be set to false if all of the following conditions are met:

HIVE-26882 is available on the Hive Metastore server
HIVE-28121 is available on the Hive Metastore server, if it is backed by MySQL or MariaDB
All other HiveCatalogs committing to tables that this HiveCatalog commits to are also on Iceberg 1.3 or later
All other HiveCatalogs committing to tables that this HiveCatalog commits to have also disabled Hive locks on commit.

Failing to ensure these conditions risks corrupting the table.

Even with iceberg.engine.hive.lock-enabled set to false, a HiveCatalog can still use locks for individual tables by setting the table property engine.hive.lock-enabled=true. This is useful when other HiveCatalogs cannot be upgraded and configured to commit without Hive locks.
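The interaction between the catalog-level setting and the table property can be sketched as a small resolution function. The text confirms only that a table property of `true` overrides a catalog setting of `false`; treating the table property as taking precedence in both directions whenever it is present is an assumption here, and `hive_lock_enabled` is a hypothetical helper.

```python
def hive_lock_enabled(catalog_conf, table_props):
    """Decide whether a commit to this table should take an HMS lock."""
    table_value = table_props.get("engine.hive.lock-enabled")
    if table_value is not None:
        # Assumption: a table-level value, when present, wins over the catalog.
        return str(table_value).lower() == "true"
    # Fall back to the catalog-level setting, which defaults to true.
    catalog_value = catalog_conf.get("iceberg.engine.hive.lock-enabled", "true")
    return str(catalog_value).lower() == "true"


catalog_conf = {"iceberg.engine.hive.lock-enabled": "false"}
print(hive_lock_enabled(catalog_conf, {}))
print(hive_lock_enabled(catalog_conf, {"engine.hive.lock-enabled": "true"}))
```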