pyspark.sql.Catalog.cacheTable

Catalog.cacheTable(tableName: str, storageLevel: Optional[pyspark.storagelevel.StorageLevel] = None) → None

Caches the specified table in memory or with the given storage level. The default storage level is MEMORY_AND_DISK.

New in version 2.0.0.

Parameters
tableName : str

name of the table to cache.

Changed in version 3.4.0: Allow tableName to be qualified with catalog name.

storageLevel : StorageLevel, optional

storage level to set for persistence.

Changed in version 3.5.0: Allow specifying the storage level.

Examples

>>> _ = spark.sql("DROP TABLE IF EXISTS tbl1")
>>> _ = spark.sql("CREATE TABLE tbl1 (name STRING, age INT) USING parquet")
>>> spark.catalog.cacheTable("tbl1")

Or cache the table with an explicit storage level.

>>> from pyspark.storagelevel import StorageLevel
>>> spark.catalog.cacheTable("tbl1", StorageLevel.OFF_HEAP)

Throws an AnalysisException when the table does not exist.

>>> spark.catalog.cacheTable("not_existing_table")
Traceback (most recent call last):
    ...
AnalysisException: ...

Using the fully qualified name for the table.

>>> spark.catalog.cacheTable("spark_catalog.default.tbl1")
>>> spark.catalog.uncacheTable("tbl1")
>>> _ = spark.sql("DROP TABLE tbl1")
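
A common companion to cacheTable is Catalog.isCached, which reports whether a table is already cached. The following is a minimal sketch of one way to combine the two, not part of the PySpark API: cache_table_once is a hypothetical helper name, and spark is assumed to be an active SparkSession (or any object exposing the same catalog methods).

```python
from typing import Any, Optional


def cache_table_once(spark: Any, table_name: str, storage_level: Optional[Any] = None) -> bool:
    """Cache ``table_name`` unless the catalog reports it as cached already.

    Hypothetical helper, not part of PySpark. Returns True if this call
    cached the table and False if the table was cached beforehand.
    """
    catalog = spark.catalog
    if catalog.isCached(table_name):
        # Nothing to do: the table is already in the cache.
        return False
    if storage_level is None:
        # Let Spark apply its default storage level.
        catalog.cacheTable(table_name)
    else:
        # Passing a storage level requires Spark 3.5.0 or later.
        catalog.cacheTable(table_name, storage_level)
    return True
```

With a real session this might be called as cache_table_once(spark, "tbl1"), or with a storage level such as StorageLevel.MEMORY_ONLY as the third argument. A matching spark.catalog.uncacheTable call releases the cache when it is no longer needed.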