pyspark.sql.Catalog
User-facing catalog API, accessible through SparkSession.catalog.
This is a thin wrapper around its Scala implementation org.apache.spark.sql.catalog.Catalog.
Changed in version 3.4.0: Supports Spark Connect.
Methods
cacheTable(tableName)
Caches the specified table in-memory.
clearCache()
Removes all cached tables from the in-memory cache.
createExternalTable(tableName[, path, …])
Creates a table based on the dataset in a data source.
createTable(tableName[, path, source, …])
Creates a table based on the dataset in a data source.
currentCatalog()
Returns the current default catalog in this session.
currentDatabase()
Returns the current default database in this session.
databaseExists(dbName)
Checks if the database with the specified name exists.
dropGlobalTempView(viewName)
Drops the global temporary view with the given view name in the catalog.
dropTempView(viewName)
Drops the local temporary view with the given view name in the catalog.
functionExists(functionName[, dbName])
Checks if the function with the specified name exists.
getDatabase(dbName)
Get the database with the specified name.
getFunction(functionName)
Get the function with the specified name.
getTable(tableName)
Get the table or view with the specified name.
isCached(tableName)
Returns true if the table is currently cached in-memory.
listCatalogs()
Returns a list of catalogs in this session.
listColumns(tableName[, dbName])
Returns a list of columns for the given table/view in the specified database.
listDatabases()
Returns a list of databases available across all sessions.
listFunctions([dbName])
Returns a list of functions registered in the specified database.
listTables([dbName])
Returns a list of tables/views in the specified database.
recoverPartitions(tableName)
Recovers all the partitions of the given table and updates the catalog.
refreshByPath(path)
Invalidates and refreshes all the cached data (and the associated metadata) for any DataFrame that contains the given data source path.
refreshTable(tableName)
Invalidates and refreshes all the cached data and metadata of the given table.
registerFunction(name, f[, returnType])
An alias for spark.udf.register().
setCurrentCatalog(catalogName)
Sets the current default catalog in this session.
setCurrentDatabase(dbName)
Sets the current default database in this session.
tableExists(tableName[, dbName])
Checks if the table or view with the specified name exists.
uncacheTable(tableName)
Removes the specified table from the in-memory cache.