Catalog

Catalog.analyzeTable(tableName[, noScan])

Computes table statistics (same as SQL ANALYZE TABLE COMPUTE STATISTICS).

Catalog.cacheTable(tableName[, storageLevel])

Caches the specified table in memory, or with the given storage level.

Catalog.clearCache()

Removes all cached tables from the in-memory cache.

Catalog.createDatabase(dbName[, ...])

Creates a namespace (database/schema).

Catalog.createExternalTable(tableName[, ...])

Creates a table based on the dataset in a data source.

Catalog.createTable(tableName[, path, ...])

Creates a table based on the dataset in a data source.

Catalog.currentCatalog()

Returns the current catalog in this session.

Catalog.currentDatabase()

Returns the current database (namespace) in this session.

Catalog.databaseExists(dbName)

Checks whether the database with the specified name exists.

Catalog.dropDatabase(dbName[, ifExists, cascade])

Drops a namespace.

Catalog.dropGlobalTempView(viewName)

Drops the global temporary view with the given view name in the catalog.

Catalog.dropTable(tableName[, ifExists, purge])

Drops a persistent table.

Catalog.dropTempView(viewName)

Drops the local temporary view with the given view name in the catalog.

Catalog.dropView(viewName[, ifExists])

Drops a persistent view.

Catalog.functionExists(functionName[, dbName])

Checks whether the function with the specified name exists.

Catalog.getCreateTableString(tableName[, ...])

Returns the SHOW CREATE TABLE DDL string for a relation.

Catalog.getDatabase(dbName)

Gets the database with the specified name.

Catalog.getFunction(functionName)

Gets the function with the specified name.

Catalog.getTable(tableName)

Gets the table or view with the specified name.

Catalog.getTableProperties(tableName)

Returns all table properties as a dict (same as SHOW TBLPROPERTIES).

Catalog.isCached(tableName)

Returns True if the table is currently cached in memory.
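
As a rough illustration of how isCached and cacheTable can be combined, the sketch below caches a table only when it is not cached yet. The helper name ensure_cached is hypothetical and not part of PySpark; it accepts any object that behaves like spark.catalog.

```python
def ensure_cached(catalog, table_name):
    """Cache `table_name` unless it is already cached.

    `catalog` is assumed to behave like `spark.catalog`, i.e. to expose
    `isCached(tableName)` and `cacheTable(tableName)`.
    Returns True if this call requested caching, False if the table
    was already cached.
    """
    if catalog.isCached(table_name):
        return False
    catalog.cacheTable(table_name)
    return True
```

With an active session bound to a variable named spark (an assumption), this would be called as ensure_cached(spark.catalog, "my_table"), where "my_table" is a placeholder name.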

Catalog.listCachedTables()

Lists named in-memory cache entries (same as SHOW CACHED TABLES).

Catalog.listCatalogs([pattern])

Returns a list of catalogs available in this session.

Catalog.listColumns(tableName[, dbName])

Returns a list of columns for the given table/view in the specified database.

Catalog.listDatabases([pattern])

Returns a list of databases (namespaces) available within the current catalog.

Catalog.listFunctions([dbName, pattern])

Returns a list of functions registered in the current database (namespace), or in the database given by dbName when provided (the name may be qualified with catalog).

Catalog.listPartitions(tableName)

Lists partition value strings for a table (same as SHOW PARTITIONS).

Catalog.listTables([dbName, pattern])

Returns a list of tables/views in the current database (namespace), or in the database given by dbName when provided (the name may be qualified with catalog).
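
One possible way to combine listTables with listColumns is sketched below: it builds a mapping from each table name to its column names and data types. The helper columns_by_table is hypothetical; it relies only on the name and isTemporary fields of the returned table entries and the name and dataType fields of the returned column entries. Temporary views are skipped here purely to keep the sketch simple.

```python
def columns_by_table(catalog, db_name=None):
    """Map each non-temporary table/view name to [(column name, data type), ...].

    `catalog` is assumed to behave like `spark.catalog`. When `db_name` is
    None, the current database is used by both catalog calls.
    """
    result = {}
    for table in catalog.listTables(db_name):
        if table.isTemporary:
            # Temporary views are not tied to a database; skipped in this sketch.
            continue
        columns = catalog.listColumns(table.name, db_name)
        result[table.name] = [(col.name, col.dataType) for col in columns]
    return result
```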

Catalog.listViews([dbName, pattern])

Lists views in a namespace.

Catalog.recoverPartitions(tableName)

Recovers all the partitions in the directory of a table and updates the catalog.

Catalog.refreshByPath(path)

Invalidates and refreshes all the cached data (and the associated metadata) for any DataFrame that contains the given data source path.

Catalog.refreshTable(tableName)

Invalidates and refreshes all the cached data and metadata of the given table.

Catalog.registerFunction(name, f[, returnType])

An alias for spark.udf.register().

Catalog.setCurrentCatalog(catalogName)

Sets the current catalog in this session.

Catalog.setCurrentDatabase(dbName)

Sets the current database (namespace) in this session.
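
Because setCurrentDatabase changes session-wide state, it can be convenient to restore the previous database afterwards. The following is a minimal sketch of that pattern using currentDatabase and setCurrentDatabase; the context manager use_database is a hypothetical helper, not a PySpark API.

```python
from contextlib import contextmanager


@contextmanager
def use_database(catalog, db_name):
    """Temporarily switch the current database, then restore the old one.

    `catalog` is assumed to behave like `spark.catalog`.
    """
    previous = catalog.currentDatabase()
    catalog.setCurrentDatabase(db_name)
    try:
        yield
    finally:
        # Restore the original database even if the body raised.
        catalog.setCurrentDatabase(previous)
```

Assuming a session named spark and an existing database called "sales" (both placeholders), a block such as `with use_database(spark.catalog, "sales"): ...` would run its body against that database and switch back on exit.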

Catalog.tableExists(tableName[, dbName])

Checks whether the table or view with the specified name exists.
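
A small sketch of how tableExists might be used as a precondition check before a job runs: it reports which of the expected tables are absent. The helper missing_tables is hypothetical and only calls tableExists(tableName) on an object that behaves like spark.catalog.

```python
def missing_tables(catalog, table_names):
    """Return the names from `table_names` for which no table or view exists.

    `catalog` is assumed to behave like `spark.catalog`; input order is kept.
    """
    return [name for name in table_names if not catalog.tableExists(name)]
```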

Catalog.truncateTable(tableName)

Truncates a table (removes all data from the table; not supported for views).

Catalog.uncacheTable(tableName)

Removes the specified table from the in-memory cache.