Packages

c

org.apache.spark.sql.jdbc

JdbcDialect

abstract class JdbcDialect extends Serializable with Logging

Developer API

Encapsulates everything (extensions, workarounds, quirks) to handle the SQL dialect of a certain database or jdbc driver. Lots of databases define types that aren't explicitly supported by the JDBC spec. Some JDBC drivers also report inaccurate information---for instance, BIT(n>1) being reported as a BIT type is quite common, even though BIT in JDBC is meant for single-bit values. Also, there does not appear to be a standard name for an unbounded string or binary type; we use BLOB and CLOB by default but override with database-specific alternatives when these are absent or do not behave correctly.

Currently, the only thing done by the dialect is type mapping. getCatalystType is used when reading from a JDBC table and getJDBCType is used when writing to a JDBC table. If getCatalystType returns null, the default type handling is used for the given JDBC type. Similarly, if getJDBCType returns (null, None), the default type handling is used for the given Catalyst type.

Annotations
@DeveloperApi()
Source
JdbcDialects.scala
Linear Supertypes
Logging, Serializable, AnyRef, Any
Known Subclasses
Ordering
  1. Alphabetic
  2. By Inheritance
Inherited
  1. JdbcDialect
  2. Logging
  3. Serializable
  4. AnyRef
  5. Any
  1. Hide All
  2. Show All
Visibility
  1. Public
  2. Protected

Instance Constructors

  1. new JdbcDialect()

Type Members

  1. implicit class LogStringContext extends AnyRef
    Definition Classes
    Logging

Abstract Value Members

  1. abstract def canHandle(url: String): Boolean

    Check if this dialect instance can handle a certain jdbc url.

    Check if this dialect instance can handle a certain jdbc url.

    url

    the jdbc url.

    returns

    True if the dialect can be applied on the given jdbc url.

    Exceptions thrown

    NullPointerException if the url is null.

Concrete Value Members

  1. def alterTable(tableName: String, changes: Seq[TableChange], dbMajorVersion: Int): Array[String]

    Alter an existing table.

    Alter an existing table.

    tableName

    The name of the table to be altered.

    changes

    Changes to apply to the table.

    returns

    The SQL statements to use for altering the table.

  2. def beforeFetch(connection: Connection, properties: Map[String, String]): Unit

    Override connection specific properties to run before a select is made.

    Override connection specific properties to run before a select is made. This is in place to allow dialects that need special treatment to optimize behavior.

    connection

    The connection object

    properties

    The connection properties. This is passed through from the relation.

  3. def classifyException(e: Throwable, errorClass: String, messageParameters: Map[String, String], description: String): AnalysisException

    Gets a dialect exception, classifies it and wraps it by AnalysisException.

    Gets a dialect exception, classifies it and wraps it by AnalysisException.

    e

    The dialect specific exception.

    errorClass

    The error class assigned in the case of an unclassified e

    messageParameters

    The message parameters of errorClass

    description

    The error description

    returns

    AnalysisException or its sub-class.

  4. def compileExpression(expr: Expression): Option[String]

    Converts V2 expression to String representing a SQL expression.

    Converts V2 expression to String representing a SQL expression.

    expr

    The V2 expression to be converted.

    returns

    Converted value.

    Annotations
    @Since("3.3.0")
  5. def compileValue(value: Any): Any

    Converts value to SQL expression.

    Converts value to SQL expression.

    value

    The value to be converted.

    returns

    Converted value.

    Annotations
    @Since("2.3.0")
  6. def convertJavaDateToDate(d: Date): Date

    Converts an instance of java.sql.Date to a custom java.sql.Date value.

    Converts an instance of java.sql.Date to a custom java.sql.Date value.

    d

    the date value returned from JDBC ResultSet getDate method.

    returns

    the date value after conversion

    Annotations
    @Since("4.0.0")
  7. def convertJavaTimestampToTimestamp(t: Timestamp): Timestamp

    Converts an instance of java.sql.Timestamp to a custom java.sql.Timestamp value.

    Converts an instance of java.sql.Timestamp to a custom java.sql.Timestamp value.

    t

    represents a specific instant in time based on the hybrid calendar which combines Julian and Gregorian calendars.

    returns

    the timestamp value to convert to

    Annotations
    @Since("3.5.0")
    Exceptions thrown

    IllegalArgumentException if t is null

  8. def convertJavaTimestampToTimestampNTZ(t: Timestamp): LocalDateTime

    Convert java.sql.Timestamp to a LocalDateTime representing the same wall-clock time as the value stored in a remote database.

    Convert java.sql.Timestamp to a LocalDateTime representing the same wall-clock time as the value stored in a remote database. JDBC dialects should override this function to provide implementations that suit their JDBC drivers.

    t

    Timestamp returned from JDBC driver getTimestamp method.

    returns

    A LocalDateTime representing the same wall clock time as the timestamp in database.

    Annotations
    @Since("3.5.0")
  9. def convertTimestampNTZToJavaTimestamp(ldt: LocalDateTime): Timestamp

    Converts a LocalDateTime representing a TimestampNTZ type to an instance of java.sql.Timestamp.

    Converts a LocalDateTime representing a TimestampNTZ type to an instance of java.sql.Timestamp.

    ldt

    representing a TimestampNTZType.

    returns

    A Java Timestamp representing this LocalDateTime.

    Annotations
    @Since("3.5.0")
  10. def createConnectionFactory(options: JDBCOptions): (Int) => Connection

    Returns a factory for creating connections to the given JDBC URL.

    Returns a factory for creating connections to the given JDBC URL. In general, creating a connection has nothing to do with JDBC partition id. But sometimes it is needed, such as a database with multiple shard nodes.

    options

    - JDBC options that contains url, table and other information.

    returns

    The factory method for creating JDBC connections with the RDD partition ID. -1 means the connection is being created at the driver side.

    Annotations
    @Since("3.3.0")
    Exceptions thrown

    IllegalArgumentException if the driver could not open a JDBC connection.

  11. def createIndex(indexName: String, tableIdent: Identifier, columns: Array[NamedReference], columnsProperties: Map[NamedReference, Map[String, String]], properties: Map[String, String]): String

    Build a create index SQL statement.

    Build a create index SQL statement.

    indexName

    the name of the index to be created

    tableIdent

    the table on which index to be created

    columns

    the columns on which index to be created

    columnsProperties

    the properties of the columns on which index to be created

    properties

    the properties of the index to be created

    returns

    the SQL statement to use for creating the index.

  12. def createSchema(statement: Statement, schema: String, comment: String): Unit

    Create schema with an optional comment.

    Create schema with an optional comment. Empty string means no comment.

  13. def createTable(statement: Statement, tableName: String, strSchema: String, options: JdbcOptionsInWrite): Unit

    Create the table if the table does not exist.

    Create the table if the table does not exist. To allow certain options to append when create a new table, which can be table_options or partition_options. E.g., "CREATE TABLE t (name string) ENGINE=InnoDB DEFAULT CHARSET=utf8"

    statement

    The Statement object used to execute SQL statements.

    tableName

    The name of the table to be created.

    strSchema

    The schema of the table to be created.

    options

    The JDBC options. It contains the create table option, which can be table_options or partition_options.

  14. def dropIndex(indexName: String, tableIdent: Identifier): String

    Build a drop index SQL statement.

    Build a drop index SQL statement.

    indexName

    the name of the index to be dropped.

    tableIdent

    the table on which index to be dropped.

    returns

    the SQL statement to use for dropping the index.

  15. def dropSchema(schema: String, cascade: Boolean): String
  16. def dropTable(table: String): String

    Build a SQL statement to drop the given table.

    Build a SQL statement to drop the given table.

    table

    the table name

    returns

    The SQL statement to use for drop the table.

    Annotations
    @Since("4.0.0")
  17. def functions: Seq[(String, UnboundFunction)]

    List the user-defined functions in jdbc dialect.

    List the user-defined functions in jdbc dialect.

    returns

    a sequence of tuple from function name to user-defined function.

  18. def getAddColumnQuery(tableName: String, columnName: String, dataType: String): String
  19. def getCatalystType(sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType]

    Get the custom datatype mapping for the given jdbc meta information.

    Get the custom datatype mapping for the given jdbc meta information.

    Guidelines for mapping database defined timestamps to Spark SQL timestamps:

    - TIMESTAMP WITHOUT TIME ZONE if preferTimestampNTZ -> org.apache.spark.sql.types.TimestampNTZType

    - TIMESTAMP WITHOUT TIME ZONE if !preferTimestampNTZ -> org.apache.spark.sql.types.TimestampType(LTZ)

    - If the TIMESTAMP cannot be distinguished by sqlType and typeName, preferTimestampNTZ is respected for now, but we may need to add another option in the future if necessary.

    sqlType

    Refers to java.sql.Types constants, or other constants defined by the target database, e.g. -101 is Oracle's TIMESTAMP WITH TIME ZONE type. This value is returned by java.sql.ResultSetMetaData#getColumnType.

    typeName

    The column type name used by the database (e.g. "BIGINT UNSIGNED"). This is sometimes used to determine the target data type when sqlType is not sufficient if multiple database types are conflated into a single id. This value is returned by java.sql.ResultSetMetaData#getColumnTypeName.

    size

    The size of the type, e.g. the maximum precision for numeric types, length for character string, etc. This value is returned by java.sql.ResultSetMetaData#getPrecision.

    md

    Result metadata associated with this type. This contains additional information from java.sql.ResultSetMetaData or user specified options. - isTimestampNTZ: Whether read a TIMESTAMP WITHOUT TIME ZONE value as org.apache.spark.sql.types.TimestampNTZType or not. This is configured by JDBCOptions.preferTimestampNTZ. - scale: The length of fractional part java.sql.ResultSetMetaData#getScale

    returns

    An option the actual DataType (subclasses of org.apache.spark.sql.types.DataType) or None if the default type mapping should be used.

  20. def getDayTimeIntervalAsMicros(daytimeStr: String): Long

    Converts a day-time interval string to a long value micros.

    Converts a day-time interval string to a long value micros.

    daytimeStr

    the day-time interval string

    returns

    the number of total microseconds in the interval

    Annotations
    @Since("4.0.0")
    Exceptions thrown

    IllegalArgumentException if the input string is invalid

  21. def getDeleteColumnQuery(tableName: String, columnName: String): String
  22. def getFullyQualifiedQuotedTableName(ident: Identifier): String

    Return the DB-specific quoted and fully qualified table name

    Return the DB-specific quoted and fully qualified table name

    Annotations
    @Since("3.5.0")
  23. def getJDBCType(dt: DataType): Option[JdbcType]

    Retrieve the jdbc / sql type for a given datatype.

    Retrieve the jdbc / sql type for a given datatype.

    dt

    The datatype (e.g. org.apache.spark.sql.types.StringType)

    returns

    The new JdbcType if there is an override for this DataType

  24. def getJdbcSQLQueryBuilder(options: JDBCOptions): JdbcSQLQueryBuilder

    Returns the SQL builder for the SELECT statement.

  25. def getLimitClause(limit: Integer): String

    Returns the LIMIT clause for the SELECT statement

  26. def getOffsetClause(offset: Integer): String

    Returns the OFFSET clause for the SELECT statement

  27. def getRenameColumnQuery(tableName: String, columnName: String, newName: String, dbMajorVersion: Int): String
  28. def getSchemaCommentQuery(schema: String, comment: String): String
  29. def getSchemaQuery(table: String): String

    The SQL query that should be used to discover the schema of a table.

    The SQL query that should be used to discover the schema of a table. It only needs to ensure that the result set has the same schema as the table, such as by calling "SELECT * ...". Dialects can override this method to return a query that works best in a particular database.

    table

    The name of the table.

    returns

    The SQL query to use for discovering the schema.

    Annotations
    @Since("2.1.0")
  30. def getTableCommentQuery(table: String, comment: String): String
  31. def getTableExistsQuery(table: String): String

    Get the SQL query that should be used to find if the given table exists.

    Get the SQL query that should be used to find if the given table exists. Dialects can override this method to return a query that works best in a particular database.

    table

    The name of the table.

    returns

    The SQL query to use for checking the table.

  32. def getTableSample(sample: TableSampleInfo): String
  33. def getTruncateQuery(table: String, cascade: Option[Boolean] = isCascadingTruncateTable()): String

    The SQL query that should be used to truncate a table.

    The SQL query that should be used to truncate a table. Dialects can override this method to return a query that is suitable for a particular database. For PostgreSQL, for instance, a different query is used to prevent "TRUNCATE" affecting other tables.

    table

    The table to truncate

    cascade

    Whether or not to cascade the truncation

    returns

    The SQL query to use for truncating a table

    Annotations
    @Since("2.4.0")
  34. def getTruncateQuery(table: String): String

    The SQL query that should be used to truncate a table.

    The SQL query that should be used to truncate a table. Dialects can override this method to return a query that is suitable for a particular database. For PostgreSQL, for instance, a different query is used to prevent "TRUNCATE" affecting other tables.

    table

    The table to truncate

    returns

    The SQL query to use for truncating a table

    Annotations
    @Since("2.3.0")
  35. def getUpdateColumnNullabilityQuery(tableName: String, columnName: String, isNullable: Boolean): String
  36. def getUpdateColumnTypeQuery(tableName: String, columnName: String, newDataType: String): String
  37. def getYearMonthIntervalAsMonths(yearmonthStr: String): Int

    Converts an year-month interval string to an int value months.

    Converts an year-month interval string to an int value months.

    yearmonthStr

    the year-month interval string

    returns

    the number of total months in the interval

    Annotations
    @Since("4.0.0")
    Exceptions thrown

    IllegalArgumentException if the input string is invalid

  38. def indexExists(conn: Connection, indexName: String, tableIdent: Identifier, options: JDBCOptions): Boolean

    Checks whether an index exists

    Checks whether an index exists

    indexName

    the name of the index

    tableIdent

    the table on which index to be checked

    options

    JDBCOptions of the table

    returns

    true if the index with indexName exists in the table with tableName, false otherwise

  39. def insertIntoTable(table: String, fields: Array[StructField]): String

    Returns an Insert SQL statement template for inserting a row into the target table via JDBC conn.

    Returns an Insert SQL statement template for inserting a row into the target table via JDBC conn. Use "?" as placeholder for each value to be inserted. E.g. INSERT INTO t ("name", "age", "gender") VALUES (?, ?, ?)

    table

    The name of the table.

    fields

    The fields of the row that will be inserted.

    returns

    The SQL query to use for insert data into table.

    Annotations
    @Since("4.0.0")
  40. def isCascadingTruncateTable(): Option[Boolean]

    Return Some[true] iff TRUNCATE TABLE causes cascading default.

    Return Some[true] iff TRUNCATE TABLE causes cascading default. Some[true] : TRUNCATE TABLE causes cascading. Some[false] : TRUNCATE TABLE does not cause cascading. None: The behavior of TRUNCATE TABLE is unknown (default).

  41. def isSupportedFunction(funcName: String): Boolean

    Returns whether the database supports function.

    Returns whether the database supports function.

    funcName

    Upper-cased function name

    returns

    True if the database supports function.

    Annotations
    @Since("3.3.0")
  42. def listIndexes(conn: Connection, tableIdent: Identifier, options: JDBCOptions): Array[TableIndex]

    Lists all the indexes in this table.

  43. def listSchemas(conn: Connection, options: JDBCOptions): Array[Array[String]]

    Lists all the schemas in this table.

  44. def quoteIdentifier(colName: String): String

    Quotes the identifier.

    Quotes the identifier. This is used to put quotes around the identifier in case the column name is a reserved keyword, or in case it contains characters that require quotes (e.g. space).

  45. def removeSchemaCommentQuery(schema: String): String
  46. def renameTable(oldTable: Identifier, newTable: Identifier): String

    Rename an existing table.

    Rename an existing table.

    oldTable

    The existing table.

    newTable

    New name of the table.

    returns

    The SQL statement to use for renaming the table.

    Annotations
    @Since("3.5.0")
  47. def schemasExists(conn: Connection, options: JDBCOptions, schema: String): Boolean

    Check schema exists or not.

  48. def supportsLimit: Boolean

    Returns ture if dialect supports LIMIT clause.

    Returns ture if dialect supports LIMIT clause.

    Note: Some build-in dialect supports LIMIT clause with some trick, please see: OracleDialect.OracleSQLQueryBuilder and MsSqlServerDialect.MsSqlServerSQLQueryBuilder.

  49. def supportsOffset: Boolean

    Returns ture if dialect supports OFFSET clause.

    Returns ture if dialect supports OFFSET clause.

    Note: Some build-in dialect supports OFFSET clause with some trick, please see: OracleDialect.OracleSQLQueryBuilder and MySQLDialect.MySQLSQLQueryBuilder.

  50. def supportsTableSample: Boolean
  51. def updateExtraColumnMeta(conn: Connection, rsmd: ResultSetMetaData, columnIdx: Int, metadata: MetadataBuilder): Unit

    Get extra column metadata for the given column.

    Get extra column metadata for the given column.

    conn

    The connection currently connection being used.

    rsmd

    The metadata of the current result set.

    columnIdx

    The index of the column.

    metadata

    The metadata builder to store the extra column information.

    Annotations
    @Since("4.0.0")

Deprecated Value Members

  1. def classifyException(message: String, e: Throwable): AnalysisException

    Gets a dialect exception, classifies it and wraps it by AnalysisException.

    Gets a dialect exception, classifies it and wraps it by AnalysisException.

    message

    The error message to be placed to the returned exception.

    e

    The dialect specific exception.

    returns

    AnalysisException or its sub-class.

    Annotations
    @deprecated
    Deprecated

    (Since version 4.0.0) Please override the classifyException method with an error class

  2. def compileAggregate(aggFunction: AggregateFunc): Option[String]

    Converts aggregate function to String representing a SQL expression.

    Converts aggregate function to String representing a SQL expression.

    aggFunction

    The aggregate function to be converted.

    returns

    Converted value.

    Annotations
    @Since("3.3.0") @deprecated
    Deprecated

    (Since version 3.4.0) use org.apache.spark.sql.jdbc.JdbcDialect.compileExpression instead.

  3. def renameTable(oldTable: String, newTable: String): String

    Rename an existing table.

    Rename an existing table.

    oldTable

    The existing table.

    newTable

    New name of the table.

    returns

    The SQL statement to use for renaming the table.

    Annotations
    @deprecated
    Deprecated

    (Since version 3.5.0) Please override renameTable method with identifiers