abstract class Catalog extends AnyRef
Catalog interface for Spark. To access this, use SparkSession.catalog.
- Annotations
- @Stable()
- Source
- Catalog.scala
- Since
2.0.0
Instance Constructors
- new Catalog()
Abstract Value Members
-
abstract
def
cacheTable(tableName: String, storageLevel: StorageLevel): Unit
Caches the specified table with the given storage level.
- tableName
is either a qualified or unqualified name that designates a table/view. If no database identifier is provided, it refers to a temporary view or a table/view in the current database.
- storageLevel
storage level to cache table.
- Since
2.3.0
-
abstract
def
cacheTable(tableName: String): Unit
Caches the specified table in-memory.
- tableName
is either a qualified or unqualified name that designates a table/view. If no database identifier is provided, it refers to a temporary view or a table/view in the current database.
- Since
2.0.0
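As a minimal sketch of how the caching methods fit together, the following assumes a local SparkSession and a hypothetical temporary view named `numbers` (neither is part of this API page). It caches the view with an explicit storage level, checks `isCached`, and then removes it from the cache again.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object CacheTableSketch {
  // Returns (isCached after cacheTable, isCached after uncacheTable).
  def run(spark: SparkSession): (Boolean, Boolean) = {
    // Hypothetical temporary view, created only for this illustration.
    spark.range(10).createOrReplaceTempView("numbers")

    // Unqualified name: resolves to the temporary view registered above.
    spark.catalog.cacheTable("numbers", StorageLevel.MEMORY_ONLY)
    val cachedAfterCache = spark.catalog.isCached("numbers")

    spark.catalog.uncacheTable("numbers")
    val cachedAfterUncache = spark.catalog.isCached("numbers")

    spark.catalog.dropTempView("numbers")
    (cachedAfterCache, cachedAfterUncache)
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption for running the sketch standalone.
    val spark = SparkSession.builder().master("local[1]").appName("cache-table-sketch").getOrCreate()
    try println(run(spark)) finally spark.stop()
  }
}
```

The single-argument `cacheTable(tableName)` overload could be used instead when the default storage level is acceptable.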
-
abstract
def
clearCache(): Unit
Removes all cached tables from the in-memory cache.
- Since
2.0.0
-
abstract
def
createTable(tableName: String, source: String, schema: StructType, description: String, options: Map[String, String]): DataFrame
(Scala-specific) Create a table based on the dataset in a data source, a schema and a set of options. Then, returns the corresponding DataFrame.
- tableName
is either a qualified or unqualified name that designates a table. If no database identifier is provided, it refers to a table in the current database.
- Since
3.1.0
-
abstract
def
createTable(tableName: String, source: String, schema: StructType, options: Map[String, String]): DataFrame
(Scala-specific) Create a table based on the dataset in a data source, a schema and a set of options. Then, returns the corresponding DataFrame.
- tableName
is either a qualified or unqualified name that designates a table. If no database identifier is provided, it refers to a table in the current database.
- Since
2.2.0
-
abstract
def
createTable(tableName: String, source: String, description: String, options: Map[String, String]): DataFrame
(Scala-specific) Creates a table based on the dataset in a data source and a set of options. Then, returns the corresponding DataFrame.
- tableName
is either a qualified or unqualified name that designates a table. If no database identifier is provided, it refers to a table in the current database.
- Since
3.1.0
-
abstract
def
createTable(tableName: String, source: String, options: Map[String, String]): DataFrame
(Scala-specific) Creates a table based on the dataset in a data source and a set of options. Then, returns the corresponding DataFrame.
- tableName
is either a qualified or unqualified name that designates a table. If no database identifier is provided, it refers to a table in the current database.
- Since
2.2.0
-
abstract
def
createTable(tableName: String, path: String, source: String): DataFrame
Creates a table from the given path based on a data source and returns the corresponding DataFrame.
- tableName
is either a qualified or unqualified name that designates a table. If no database identifier is provided, it refers to a table in the current database.
- Since
2.2.0
-
abstract
def
createTable(tableName: String, path: String): DataFrame
Creates a table from the given path and returns the corresponding DataFrame. It will use the default data source configured by spark.sql.sources.default.
- tableName
is either a qualified or unqualified name that designates a table. If no database identifier is provided, it refers to a table in the current database.
- Since
2.2.0
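The path-based overloads can be illustrated with a small sketch. It assumes a local SparkSession without Hive support, a scratch directory created for the purpose, and a hypothetical table name `sketch_events`; it writes a few rows as Parquet and registers that location as a table.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

object CreateTableSketch {
  // Writes a small Parquet dataset, registers it as a table, and returns its row count.
  def run(spark: SparkSession): Long = {
    // Hypothetical scratch location; the "data" subdirectory does not exist yet.
    val dir = Files.createTempDirectory("catalog-sketch").resolve("data").toString
    spark.range(5).write.parquet(dir)

    // Explicit source; the two-argument overload would fall back to
    // the source configured by spark.sql.sources.default.
    val df = spark.catalog.createTable("sketch_events", dir, "parquet")
    val rows = df.count()

    // Clean up the catalog entry so the sketch can be re-run.
    spark.sql("DROP TABLE IF EXISTS sketch_events")
    rows
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("create-table-sketch").getOrCreate()
    try println(run(spark)) finally spark.stop()
  }
}
```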
-
abstract
def
currentCatalog(): String
Returns the current catalog in this session.
- Since
3.4.0
-
abstract
def
currentDatabase: String
Returns the current database (namespace) in this session.
- Since
2.0.0
-
abstract
def
databaseExists(dbName: String): Boolean
Check if the database (namespace) with the specified name exists (the name can be qualified with catalog).
- Since
2.1.0
-
abstract
def
dropGlobalTempView(viewName: String): Boolean
Drops the global temporary view with the given view name in the catalog. If the view has been cached before, then it will also be uncached.
A global temporary view is cross-session. Its lifetime is the lifetime of the Spark application, i.e. it is automatically dropped when the application terminates. It is tied to the system-preserved database global_temp, and the qualified name must be used to refer to a global temp view, e.g. SELECT * FROM global_temp.view1.
- viewName
the unqualified name of the temporary view to be dropped.
- returns
true if the view is dropped successfully, false otherwise.
- Since
2.1.0
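A short sketch of the global temporary view lifecycle, assuming a local SparkSession and reusing the name `view1` from the description above: the view is created, queried through its `global_temp` qualified name, and dropped by its unqualified name.

```scala
import org.apache.spark.sql.SparkSession

object GlobalTempViewSketch {
  // Returns (row count via qualified name, first drop result, second drop result).
  def run(spark: SparkSession): (Long, Boolean, Boolean) = {
    spark.range(4).createOrReplaceGlobalTempView("view1")

    // Queries must use the global_temp qualifier.
    val rows = spark.sql("SELECT * FROM global_temp.view1").count()

    // dropGlobalTempView takes the unqualified view name.
    val dropped = spark.catalog.dropGlobalTempView("view1")
    // A second drop finds nothing to remove.
    val droppedAgain = spark.catalog.dropGlobalTempView("view1")
    (rows, dropped, droppedAgain)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("global-temp-view-sketch").getOrCreate()
    try println(run(spark)) finally spark.stop()
  }
}
```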
-
abstract
def
dropTempView(viewName: String): Boolean
Drops the local temporary view with the given view name in the catalog. If the view has been cached before, then it will also be uncached.
A local temporary view is session-scoped. Its lifetime is the lifetime of the session that created it, i.e. it is automatically dropped when the session terminates. It is not tied to any database, i.e. db1.view1 cannot be used to reference a local temporary view.
Note that the return type of this method was Unit in Spark 2.0, but changed to Boolean in Spark 2.1.
- viewName
the name of the temporary view to be dropped.
- returns
true if the view is dropped successfully, false otherwise.
- Since
2.0.0
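The Boolean return value can be used to tell whether a view was actually removed. The sketch below assumes a local SparkSession and a hypothetical view name `scratch_view`.

```scala
import org.apache.spark.sql.SparkSession

object DropTempViewSketch {
  // Returns the results of dropping the same view twice.
  def run(spark: SparkSession): (Boolean, Boolean) = {
    // Hypothetical session-scoped view used only for this illustration.
    spark.range(3).createOrReplaceTempView("scratch_view")

    val first = spark.catalog.dropTempView("scratch_view")  // view existed
    val second = spark.catalog.dropTempView("scratch_view") // view already gone
    (first, second)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("drop-temp-view-sketch").getOrCreate()
    try println(run(spark)) finally spark.stop()
  }
}
```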
-
abstract
def
functionExists(dbName: String, functionName: String): Boolean
Check if the function with the specified name exists in the specified database under the Hive Metastore.
To check the existence of functions in other catalogs, use functionExists(functionName) with a qualified function name instead.
- dbName
is an unqualified name that designates a database.
- functionName
is an unqualified name that designates a function.
- Since
2.1.0
-
abstract
def
functionExists(functionName: String): Boolean
Check if the function with the specified name exists. This can either be a temporary function or a function.
- functionName
is either a qualified or unqualified name that designates a function. It follows the same resolution rules as SQL: built-in/temp functions are searched first, then functions in the current database (namespace).
- Since
2.1.0
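As an illustration of the unqualified lookup, the sketch below registers a temporary function and checks for it. The local SparkSession and the function names `plus_one` and `no_such_function_xyz` are assumptions made for the example.

```scala
import org.apache.spark.sql.SparkSession

object FunctionExistsSketch {
  // Returns (registered temp function exists, unknown function exists).
  def run(spark: SparkSession): (Boolean, Boolean) = {
    // Hypothetical temporary function registered for this session only.
    spark.udf.register("plus_one", (x: Long) => x + 1)

    val known = spark.catalog.functionExists("plus_one")
    val unknown = spark.catalog.functionExists("no_such_function_xyz")
    (known, unknown)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("function-exists-sketch").getOrCreate()
    try println(run(spark)) finally spark.stop()
  }
}
```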
-
abstract
def
getDatabase(dbName: String): Database
Get the database (namespace) with the specified name (can be qualified with catalog). This throws an AnalysisException when the database (namespace) cannot be found.
- Annotations
- @throws( "database does not exist" )
- Since
2.1.0
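Because `getDatabase` throws when the database is missing, callers may want to turn that into an `Option`. The helper name `lookup` and the database name `no_such_db_xyz` below are hypothetical; this is one possible pattern under those assumptions, not a recommended API.

```scala
import org.apache.spark.sql.{AnalysisException, SparkSession}

object GetDatabaseSketch {
  // Returns the database name if it can be resolved, None otherwise.
  def lookup(spark: SparkSession, dbName: String): Option[String] =
    try Some(spark.catalog.getDatabase(dbName).name)
    catch { case _: AnalysisException => None }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("get-database-sketch").getOrCreate()
    try {
      println(lookup(spark, "default"))
      println(lookup(spark, "no_such_db_xyz"))
    } finally spark.stop()
  }
}
```

Calling `databaseExists(dbName)` first is an alternative when only a yes/no answer is needed.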
-
abstract
def
getFunction(dbName: String, functionName: String): Function
Get the function with the specified name in the specified database under the Hive Metastore. This throws an AnalysisException when the function cannot be found.
To get functions in other catalogs, use getFunction(functionName) with a qualified function name instead.
- dbName
is an unqualified name that designates a database.
- functionName
is an unqualified name that designates a function in the specified database
- Annotations
- @throws( ... )
- Since
2.1.0
-
abstract
def
getFunction(functionName: String): Function
Get the function with the specified name. This function can be a temporary function or a function. This throws an AnalysisException when the function cannot be found.
- functionName
is either a qualified or unqualified name that designates a function. It follows the same resolution rules as SQL: built-in/temp functions are searched first, then functions in the current database (namespace).
- Annotations
- @throws( "function does not exist" )
- Since
2.1.0
-
abstract
def
getTable(dbName: String, tableName: String): Table
Get the table or view with the specified name in the specified database under the Hive Metastore. This throws an AnalysisException when no Table can be found.
To get a table/view in other catalogs, use getTable(tableName) with a qualified table/view name instead.
- Annotations
- @throws( "database or table does not exist" )
- Since
2.1.0
-
abstract
def
getTable(tableName: String): Table
Get the table or view with the specified name. This table can be a temporary view or a table/view. This throws an AnalysisException when no Table can be found.
- tableName
is either a qualified or unqualified name that designates a table/view. It follows the same resolution rules as SQL: temp views are searched first, then tables/views in the current database (namespace).
- Annotations
- @throws( "table does not exist" )
- Since
2.1.0
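A small sketch of resolving an unqualified name to a temporary view, assuming a local SparkSession and a hypothetical view `tmp_people`. It guards the `getTable` call with `tableExists` and reads two fields of the returned `Table` metadata.

```scala
import org.apache.spark.sql.SparkSession

object GetTableSketch {
  // Returns (name, isTemporary) of the resolved table, if it exists.
  def run(spark: SparkSession): Option[(String, Boolean)] = {
    // Hypothetical temporary view used only for this illustration.
    spark.range(1).createOrReplaceTempView("tmp_people")

    val result =
      if (spark.catalog.tableExists("tmp_people")) {
        val table = spark.catalog.getTable("tmp_people")
        Some((table.name, table.isTemporary))
      } else None

    spark.catalog.dropTempView("tmp_people")
    result
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("get-table-sketch").getOrCreate()
    try println(run(spark)) finally spark.stop()
  }
}
```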
-
abstract
def
isCached(tableName: String): Boolean
Returns true if the table is currently cached in-memory.
- tableName
is either a qualified or unqualified name that designates a table/view. If no database identifier is provided, it refers to a temporary view or a table/view in the current database.
- Since
2.0.0
-
abstract
def
listCatalogs(): Dataset[CatalogMetadata]
Returns a list of catalogs available in this session.
- Since
3.4.0
-
abstract
def
listColumns(dbName: String, tableName: String): Dataset[Column]
Returns a list of columns for the given table/view in the specified database under the Hive Metastore.
To list columns for a table/view in other catalogs, use listColumns(tableName) with a qualified table/view name instead.
- dbName
is an unqualified name that designates a database.
- tableName
is an unqualified name that designates a table/view.
- Annotations
- @throws( "database or table does not exist" )
- Since
2.0.0
-
abstract
def
listColumns(tableName: String): Dataset[Column]
Returns a list of columns for the given table/view or temporary view.
- tableName
is either a qualified or unqualified name that designates a table/view. It follows the same resolution rules as SQL: temp views are searched first, then tables/views in the current database (namespace).
- Annotations
- @throws( "table does not exist" )
- Since
2.0.0
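To show what the returned Dataset contains, the sketch below lists the column names of a hypothetical temporary view `pairs` in a local SparkSession. Note that the element type here is the catalog's own `Column` metadata class, not the `Column` used in DataFrame expressions.

```scala
import org.apache.spark.sql.SparkSession

object ListColumnsSketch {
  // Returns the column names reported by the catalog for the view.
  def run(spark: SparkSession): Seq[String] = {
    // Hypothetical two-column view used only for this illustration.
    spark.range(3).selectExpr("id", "id * 2 AS doubled").createOrReplaceTempView("pairs")

    // Each element is catalog metadata (name, dataType, nullable, ...).
    val names = spark.catalog.listColumns("pairs").collect().map(_.name).toSeq

    spark.catalog.dropTempView("pairs")
    names
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("list-columns-sketch").getOrCreate()
    try println(run(spark)) finally spark.stop()
  }
}
```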
-
abstract
def
listDatabases(): Dataset[Database]
Returns a list of databases (namespaces) available within the current catalog.
- Since
2.0.0
-
abstract
def
listFunctions(dbName: String): Dataset[Function]
Returns a list of functions registered in the specified database (namespace) (the name can be qualified with catalog). This includes all built-in and temporary functions.
- Annotations
- @throws( "database does not exist" )
- Since
2.0.0
-
abstract
def
listFunctions(): Dataset[Function]
Returns a list of functions registered in the current database (namespace). This includes all temporary functions.
- Since
2.0.0
-
abstract
def
listTables(dbName: String): Dataset[Table]
Returns a list of tables/views in the specified database (namespace) (the name can be qualified with catalog). This includes all temporary views.
- Annotations
- @throws( "database does not exist" )
- Since
2.0.0
-
abstract
def
listTables(): Dataset[Table]
Returns a list of tables/views in the current database (namespace). This includes all temporary views.
- Since
2.0.0
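Since the listing includes temporary views, a caller may want to separate them from catalog tables. The following sketch assumes a local SparkSession and a hypothetical view `sketch_tmp`, and filters on the `isTemporary` field of each `Table`.

```scala
import org.apache.spark.sql.SparkSession

object ListTablesSketch {
  // Returns the names of temporary views visible from the current database.
  def run(spark: SparkSession): Seq[String] = {
    // Hypothetical temporary view used only for this illustration.
    spark.range(1).createOrReplaceTempView("sketch_tmp")

    val tempNames = spark.catalog.listTables()
      .collect()
      .filter(_.isTemporary)
      .map(_.name)
      .toSeq

    spark.catalog.dropTempView("sketch_tmp")
    tempNames
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("list-tables-sketch").getOrCreate()
    try println(run(spark)) finally spark.stop()
  }
}
```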
-
abstract
def
recoverPartitions(tableName: String): Unit
Recovers all the partitions in the directory of a table and updates the catalog. Only works with a partitioned table, and not a view.
- tableName
is either a qualified or unqualified name that designates a table. If no database identifier is provided, it refers to a table in the current database.
- Since
2.1.1
-
abstract
def
refreshByPath(path: String): Unit
Invalidates and refreshes all the cached data (and the associated metadata) for any Dataset that contains the given data source path. Path matching is by prefix, i.e. "/" would invalidate everything that is cached.
- Since
2.0.0
-
abstract
def
refreshTable(tableName: String): Unit
Invalidates and refreshes all the cached data and metadata of the given table. For performance reasons, Spark SQL or the external data source library it uses might cache certain metadata about a table, such as the location of blocks. When those change outside of Spark SQL, users should call this function to invalidate the cache.
If this table is cached as an InMemoryRelation, the original cached version is dropped and the new version is cached lazily.
- tableName
is either a qualified or unqualified name that designates a table/view. If no database identifier is provided, it refers to a temporary view or a table/view in the current database.
- Since
2.0.0
-
abstract
def
setCurrentCatalog(catalogName: String): Unit
Sets the current catalog in this session.
- Since
3.4.0
-
abstract
def
setCurrentDatabase(dbName: String): Unit
Sets the current database (namespace) in this session.
- Since
2.0.0
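A minimal sketch of switching the session's current database and restoring it afterwards. The local SparkSession and the database name `sketch_db` are assumptions for the example; the database is created and dropped through SQL only so that the sketch is self-contained.

```scala
import org.apache.spark.sql.SparkSession

object CurrentDatabaseSketch {
  // Returns (database before the switch, database after the switch).
  def run(spark: SparkSession): (String, String) = {
    val before = spark.catalog.currentDatabase

    // Hypothetical database created only for this illustration.
    spark.sql("CREATE DATABASE IF NOT EXISTS sketch_db")
    spark.catalog.setCurrentDatabase("sketch_db")
    val during = spark.catalog.currentDatabase

    // Switch back before dropping, so the session is not left in a removed database.
    spark.catalog.setCurrentDatabase(before)
    spark.sql("DROP DATABASE IF EXISTS sketch_db CASCADE")
    (before, during)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("current-database-sketch").getOrCreate()
    try println(run(spark)) finally spark.stop()
  }
}
```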
-
abstract
def
tableExists(dbName: String, tableName: String): Boolean
Check if the table or view with the specified name exists in the specified database under the Hive Metastore.
To check the existence of a table/view in other catalogs, use tableExists(tableName) with a qualified table/view name instead.
- dbName
is an unqualified name that designates a database.
- tableName
is an unqualified name that designates a table.
- Since
2.1.0
-
abstract
def
tableExists(tableName: String): Boolean
Check if the table or view with the specified name exists. This can either be a temporary view or a table/view.
- tableName
is either a qualified or unqualified name that designates a table/view. It follows the same resolution rules as SQL: temp views are searched first, then tables/views in the current database (namespace).
- Since
2.1.0
-
abstract
def
uncacheTable(tableName: String): Unit
Removes the specified table from the in-memory cache.
- tableName
is either a qualified or unqualified name that designates a table/view. If no database identifier is provided, it refers to a temporary view or a table/view in the current database.
- Since
2.0.0
Concrete Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
createTable(tableName: String, source: String, schema: StructType, description: String, options: Map[String, String]): DataFrame
Create a table based on the dataset in a data source, a schema and a set of options. Then, returns the corresponding DataFrame.
- tableName
is either a qualified or unqualified name that designates a table. If no database identifier is provided, it refers to a table in the current database.
- Since
3.1.0
-
def
createTable(tableName: String, source: String, schema: StructType, options: Map[String, String]): DataFrame
Create a table based on the dataset in a data source, a schema and a set of options. Then, returns the corresponding DataFrame.
- tableName
is either a qualified or unqualified name that designates a table. If no database identifier is provided, it refers to a table in the current database.
- Since
2.2.0
-
def
createTable(tableName: String, source: String, description: String, options: Map[String, String]): DataFrame
Creates a table based on the dataset in a data source and a set of options. Then, returns the corresponding DataFrame.
- tableName
is either a qualified or unqualified name that designates a table. If no database identifier is provided, it refers to a table in the current database.
- Since
3.1.0
-
def
createTable(tableName: String, source: String, options: Map[String, String]): DataFrame
Creates a table based on the dataset in a data source and a set of options. Then, returns the corresponding DataFrame.
- tableName
is either a qualified or unqualified name that designates a table. If no database identifier is provided, it refers to a table in the current database.
- Since
2.2.0
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
Deprecated Value Members
-
def
createExternalTable(tableName: String, source: String, schema: StructType, options: Map[String, String]): DataFrame
(Scala-specific) Create a table from the given path based on a data source, a schema and a set of options. Then, returns the corresponding DataFrame.
- tableName
is either a qualified or unqualified name that designates a table. If no database identifier is provided, it refers to a table in the current database.
- Annotations
- @deprecated
- Deprecated
(Since version 2.2.0) use createTable instead.
- Since
2.0.0
-
def
createExternalTable(tableName: String, source: String, schema: StructType, options: Map[String, String]): DataFrame
Create a table from the given path based on a data source, a schema and a set of options. Then, returns the corresponding DataFrame.
- tableName
is either a qualified or unqualified name that designates a table. If no database identifier is provided, it refers to a table in the current database.
- Annotations
- @deprecated
- Deprecated
(Since version 2.2.0) use createTable instead.
- Since
2.0.0
-
def
createExternalTable(tableName: String, source: String, options: Map[String, String]): DataFrame
(Scala-specific) Creates a table from the given path based on a data source and a set of options. Then, returns the corresponding DataFrame.
- tableName
is either a qualified or unqualified name that designates a table. If no database identifier is provided, it refers to a table in the current database.
- Annotations
- @deprecated
- Deprecated
(Since version 2.2.0) use createTable instead.
- Since
2.0.0
-
def
createExternalTable(tableName: String, source: String, options: Map[String, String]): DataFrame
Creates a table from the given path based on a data source and a set of options. Then, returns the corresponding DataFrame.
- tableName
is either a qualified or unqualified name that designates a table. If no database identifier is provided, it refers to a table in the current database.
- Annotations
- @deprecated
- Deprecated
(Since version 2.2.0) use createTable instead.
- Since
2.0.0
-
def
createExternalTable(tableName: String, path: String, source: String): DataFrame
Creates a table from the given path based on a data source and returns the corresponding DataFrame.
- tableName
is either a qualified or unqualified name that designates a table. If no database identifier is provided, it refers to a table in the current database.
- Annotations
- @deprecated
- Deprecated
(Since version 2.2.0) use createTable instead.
- Since
2.0.0
-
def
createExternalTable(tableName: String, path: String): DataFrame
Creates a table from the given path and returns the corresponding DataFrame. It will use the default data source configured by spark.sql.sources.default.
- tableName
is either a qualified or unqualified name that designates a table. If no database identifier is provided, it refers to a table in the current database.
- Annotations
- @deprecated
- Deprecated
(Since version 2.2.0) use createTable instead.
- Since
2.0.0