Class SparkExecutorCache

java.lang.Object
org.apache.iceberg.spark.SparkExecutorCache

public class SparkExecutorCache extends Object
An executor cache for reducing the computation and IO overhead in tasks.

The cache is configured and controlled through Spark SQL properties. It supports both a limit on the total cache size and a maximum size for individual entries. Additionally, it automatically evicts entries after a specified duration of inactivity. The cache uses the SQL configuration in effect at the time of initialization; later changes to the configuration have no effect.

The cache is accessed and populated via getOrLoad(String, String, Supplier, long). If the value is not present in the cache, it is computed using the provided supplier and stored in the cache, subject to the defined size constraints. When a key is added, it must be associated with a particular group ID. Once the group is no longer needed, it is recommended to explicitly invalidate its state by calling invalidate(String) instead of relying on automatic eviction.

Note that this class employs the singleton pattern to ensure only one cache exists per JVM.
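
The getOrLoad and invalidate contract described above can be illustrated with a small self-contained model. This is only a sketch under stated assumptions, not the Iceberg implementation: ToyGroupCache is a hypothetical class, it ignores the total size limit and time-based eviction, and it assumes that a value larger than the per-entry limit is computed but never stored.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-in, NOT the Iceberg class. It models only group-scoped keys,
// a per-entry size limit, and explicit invalidation of a whole group.
class ToyGroupCache {
  private final long maxEntrySize;
  private final Map<String, Map<String, Object>> groups = new HashMap<>();

  public ToyGroupCache(long maxEntrySize) {
    this.maxEntrySize = maxEntrySize;
  }

  @SuppressWarnings("unchecked")
  public synchronized <V> V getOrLoad(
      String group, String key, Supplier<V> valueSupplier, long valueSize) {
    // Assumption: oversized values bypass the cache and are recomputed on every call.
    if (valueSize > maxEntrySize) {
      return valueSupplier.get();
    }
    Map<String, Object> entries = groups.computeIfAbsent(group, ignored -> new HashMap<>());
    // The supplier runs only when the key is missing from this group.
    return (V) entries.computeIfAbsent(key, ignored -> valueSupplier.get());
  }

  public synchronized void invalidate(String group) {
    // Drops every key that was added under the given group ID.
    groups.remove(group);
  }

  public static void main(String[] args) {
    ToyGroupCache cache = new ToyGroupCache(1024);
    AtomicInteger loads = new AtomicInteger();
    Supplier<String> loader = () -> "value-" + loads.incrementAndGet();

    String first = cache.getOrLoad("group-1", "key-a", loader, 64);
    String second = cache.getOrLoad("group-1", "key-a", loader, 64); // served from the cache
    System.out.println(first.equals(second) + ", loads=" + loads.get());

    cache.invalidate("group-1"); // the group is no longer needed
    cache.getOrLoad("group-1", "key-a", loader, 64); // the supplier runs again
    System.out.println("loads=" + loads.get());
  }
}
```

In this model, calling invalidate as soon as a group is finished frees its entries immediately, which is the reason the text recommends explicit invalidation over waiting for automatic eviction.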

  • Method Details

    • getOrCreate

      public static SparkExecutorCache getOrCreate()
      Returns the existing cache, or creates and returns a new one if none exists yet.

      Note that this method returns null if caching is disabled.
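
      The singleton behavior, the null result when caching is disabled, and the one-time read of the configuration can be sketched with a small model. Everything here is an assumption made for illustration: ToySingletonCache, the CONF map standing in for the Spark SQL configuration, and the toy.cache.* property names are hypothetical and are not part of Iceberg or Spark.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model, NOT the Iceberg implementation.
class ToySingletonCache {
  // Stand-in for the Spark SQL configuration; the property names are made up.
  public static final Map<String, String> CONF = new ConcurrentHashMap<>();

  private static volatile ToySingletonCache instance;
  private final long maxEntrySize;

  private ToySingletonCache(long maxEntrySize) {
    this.maxEntrySize = maxEntrySize;
  }

  public static ToySingletonCache getOrCreate() {
    if (instance == null) {
      boolean enabled = Boolean.parseBoolean(CONF.getOrDefault("toy.cache.enabled", "true"));
      if (enabled) {
        synchronized (ToySingletonCache.class) {
          if (instance == null) {
            // The configuration is read once, when the single instance is created.
            long max = Long.parseLong(CONF.getOrDefault("toy.cache.max-entry-size", "1024"));
            instance = new ToySingletonCache(max);
          }
        }
      }
    }
    // Still null here if caching is disabled and nothing was created earlier.
    return instance;
  }

  public static ToySingletonCache get() {
    return instance;
  }

  public long maxEntrySize() {
    return maxEntrySize;
  }

  public static void main(String[] args) {
    CONF.put("toy.cache.max-entry-size", "2048");
    ToySingletonCache cache = getOrCreate();
    CONF.put("toy.cache.max-entry-size", "4096"); // too late: the instance already exists
    System.out.println(getOrCreate().maxEntrySize());
    System.out.println(cache == get());
  }
}
```

      Callers of a method with this contract need a null check and a fallback path that computes the value directly when no cache is available.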

    • get

      public static SparkExecutorCache get()
      Returns the cache if it has already been created, or null otherwise.

    • maxEntrySize

      public long maxEntrySize()
      Returns the maximum size in bytes of an entry that will be considered for caching.

    • getOrLoad

      public <V> V getOrLoad(String group, String key, Supplier<V> valueSupplier, long valueSize)
      Gets the cached value for the key or populates the cache with a new mapping.
      Parameters:
      group - a group ID
      key - a cache key
      valueSupplier - a supplier to compute the value
      valueSize - an estimated memory size of the value in bytes
      Returns:
      the cached or computed value

    • invalidate

      public void invalidate(String group)
      Invalidates all keys associated with the given group ID.
      Parameters:
      group - a group ID