org.apache.spark

package sql

Allows the execution of relational queries, including those expressed in SQL, using Spark.


Type Members

  1. type ArrayType = sql.catalyst.types.ArrayType

    :: DeveloperApi ::

    The data type for collections of multiple values. Internally these are represented as columns that contain a scala.collection.Seq.

    An ArrayType object comprises two fields, elementType: DataType and containsNull: Boolean. elementType specifies the type of the array's elements, and containsNull specifies whether the array can contain null values.

    Annotations
    @DeveloperApi()
  2. type DataType = sql.catalyst.types.DataType

    :: DeveloperApi ::

    The base type of all Spark SQL data types.

    Annotations
    @DeveloperApi()
  3. type DecimalType = sql.catalyst.types.DecimalType

    :: DeveloperApi ::

    The data type representing scala.math.BigDecimal values.

    TODO(matei): explain precision and scale

    Annotations
    @DeveloperApi()
  4. type MapType = sql.catalyst.types.MapType

    :: DeveloperApi ::

    The data type representing Maps. A MapType object comprises three fields: keyType: DataType, valueType: DataType, and valueContainsNull: Boolean. keyType specifies the type of the map's keys, valueType specifies the type of its values, and valueContainsNull specifies whether the map's values can be null. Keys of a MapType value are not allowed to be null.

    Annotations
    @DeveloperApi()
  5. type Metadata = sql.catalyst.util.Metadata

    :: DeveloperApi ::

    Metadata is a wrapper over Map[String, Any] that limits the value type to simple ones: Boolean, Long, Double, String, Metadata, Array[Boolean], Array[Long], Array[Double], Array[String], and Array[Metadata]. JSON is used for serialization.

    The default constructor is private. Users should use either MetadataBuilder or Metadata.fromJson to create Metadata instances.

    Annotations
    @DeveloperApi()
  6. type MetadataBuilder = sql.catalyst.util.MetadataBuilder

    :: DeveloperApi :: Builder for Metadata. If there is a key collision, the latter will overwrite the former.
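
    A brief sketch of building a Metadata value with the builder (the key names "comment" and "maxLength" are illustrative):

    import org.apache.spark.sql._
    
    // Keys are illustrative; a later put with the same key
    // overwrites the earlier value.
    val md = new MetadataBuilder()
      .putString("comment", "user id")
      .putLong("maxLength", 64L)
      .build()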

    Annotations
    @DeveloperApi()
  7. type Row = sql.catalyst.expressions.Row

    :: DeveloperApi ::

    Represents one row of output from a relational operator.

    Annotations
    @DeveloperApi()
  8. class SQLContext extends Logging with SQLConf with CacheManager with ExpressionConversions with UDFRegistration with Serializable

    :: AlphaComponent :: The entry point for running relational queries using Spark.
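
    A minimal sketch of creating one from an existing SparkContext and issuing a query (the table name "src" is illustrative and assumes a table has already been registered):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql._
    
    val sc = new SparkContext(new SparkConf().setAppName("example"))
    val sqlContext = new SQLContext(sc)
    
    // Returns a SchemaRDD; assumes a table named "src" is registered.
    val results = sqlContext.sql("SELECT key, value FROM src")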

  9. class SchemaRDD extends RDD[Row] with SchemaRDDLike

    :: AlphaComponent :: An RDD of Row objects that has an associated schema.
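
    A minimal sketch of producing and querying one (assumes a SQLContext named sqlContext, the illustrative case class Person, and the implicit RDD-to-SchemaRDD conversion imported from the context):

    // Assumes: case class Person(name: String, age: Int)
    import sqlContext.createSchemaRDD  // implicit RDD -> SchemaRDD conversion
    
    val people = sc.parallelize(Seq(Person("Alice", 30), Person("Bob", 17)))
    people.registerTempTable("people")
    
    val adults = sqlContext.sql("SELECT name FROM people WHERE age >= 18")
    adults.printSchema()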

  10. type Strategy = GenericStrategy[SparkPlan]

    Converts a logical plan into zero or more SparkPlans.

    Annotations
    @DeveloperApi()
  11. type StructField = sql.catalyst.types.StructField

    :: DeveloperApi ::

    A StructField object represents a field in a StructType object. It comprises three fields: name: String, dataType: DataType, and nullable: Boolean. name is the name of the field, dataType is its data type, and nullable specifies whether its values can be null.

    Annotations
    @DeveloperApi()
  12. type StructType = sql.catalyst.types.StructType

    :: DeveloperApi ::

    The data type representing Rows. A StructType object comprises a Seq of StructFields.

    Annotations
    @DeveloperApi()

Value Members

  1. val ArrayType: sql.catalyst.types.ArrayType.type

    :: DeveloperApi ::

    An ArrayType object can be constructed in two ways:

    ArrayType(elementType: DataType, containsNull: Boolean)

    and

    ArrayType(elementType: DataType)

    For ArrayType(elementType), containsNull is set to false.
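
    For example (a sketch using the two constructors above with types from this package):

    // Arrays of Int values that may contain nulls.
    val withNulls = ArrayType(IntegerType, true)
    // Arrays of Int values; containsNull defaults to false.
    val noNulls = ArrayType(IntegerType)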

  2. val BinaryType: sql.catalyst.types.BinaryType.type

    :: DeveloperApi ::

    The data type representing Array[Byte] values.

  3. val BooleanType: sql.catalyst.types.BooleanType.type

    :: DeveloperApi ::

    The data type representing Boolean values.

  4. val ByteType: sql.catalyst.types.ByteType.type

    :: DeveloperApi ::

    The data type representing Byte values.

  5. val DataType: sql.catalyst.types.DataType.type

  6. val DateType: sql.catalyst.types.DateType.type

    :: DeveloperApi ::

    The data type representing java.sql.Date values.

  7. val DecimalType: sql.catalyst.types.DecimalType.type

    :: DeveloperApi ::

    The data type representing scala.math.BigDecimal values.

    TODO(matei): explain precision and scale

  8. val DoubleType: sql.catalyst.types.DoubleType.type

    :: DeveloperApi ::

    The data type representing Double values.

  9. val FloatType: sql.catalyst.types.FloatType.type

    :: DeveloperApi ::

    The data type representing Float values.

  10. val IntegerType: sql.catalyst.types.IntegerType.type

    :: DeveloperApi ::

    The data type representing Int values.

  11. val LongType: sql.catalyst.types.LongType.type

    :: DeveloperApi ::

    The data type representing Long values.

  12. val MapType: sql.catalyst.types.MapType.type

    :: DeveloperApi ::

    A MapType object can be constructed in two ways:

    MapType(keyType: DataType, valueType: DataType, valueContainsNull: Boolean)

    and

    MapType(keyType: DataType, valueType: DataType)

    For MapType(keyType: DataType, valueType: DataType), valueContainsNull is set to true.
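
    For example (a sketch using the two constructors above):

    // String keys to Int values; valueContainsNull defaults to true here.
    val scores = MapType(StringType, IntegerType)
    // The same map type with null values explicitly disallowed.
    val strictScores = MapType(StringType, IntegerType, false)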

  13. val NullType: sql.catalyst.types.NullType.type

    :: DeveloperApi ::

    The data type representing NULL values.

  14. val Row: sql.catalyst.expressions.Row.type

    :: DeveloperApi ::

    A Row object can be constructed by providing field values. Example:

    import org.apache.spark.sql._
    
    // Create a Row from values.
    Row(value1, value2, value3, ...)
    // Create a Row from a Seq of values.
    Row.fromSeq(Seq(value1, value2, ...))

    A value of a row can be accessed through generic access by ordinal, which incurs boxing overhead for primitives, or through native primitive access. An example of generic access by ordinal:

    import org.apache.spark.sql._
    
    val row = Row(1, true, "a string", null)
    // row: Row = [1,true,a string,null]
    val firstValue = row(0)
    // firstValue: Any = 1
    val fourthValue = row(3)
    // fourthValue: Any = null

    For native primitive access, it is invalid to use the native primitive interface to retrieve a value that is null; instead, a user must check isNullAt before attempting to retrieve a value that might be null. An example of native primitive access:

    // using the row from the previous example.
    val firstValue = row.getInt(0)
    // firstValue: Int = 1
    val isNull = row.isNullAt(3)
    // isNull: Boolean = true

    Interfaces related to native primitive access are:

    isNullAt(i: Int): Boolean
    getInt(i: Int): Int
    getLong(i: Int): Long
    getDouble(i: Int): Double
    getFloat(i: Int): Float
    getBoolean(i: Int): Boolean
    getShort(i: Int): Short
    getByte(i: Int): Byte
    getString(i: Int): String

    Fields in a Row object can be extracted in a pattern match. Example:

    import org.apache.spark.sql._
    
    val pairs = sql("SELECT key, value FROM src").rdd.map {
      case Row(key: Int, value: String) =>
        key -> value
    }
  15. val ShortType: sql.catalyst.types.ShortType.type

    :: DeveloperApi ::

    The data type representing Short values.

  16. val StringType: sql.catalyst.types.StringType.type

    :: DeveloperApi ::

    The data type representing String values.

  17. val StructField: sql.catalyst.types.StructField.type

    :: DeveloperApi ::

    A StructField object can be constructed by

    StructField(name: String, dataType: DataType, nullable: Boolean)
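
    For example, a nullable integer field (the field name "age" is illustrative):

    val ageField = StructField("age", IntegerType, true)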
  18. val StructType: sql.catalyst.types.StructType.type

    :: DeveloperApi ::

    A StructType object can be constructed by

    StructType(fields: Seq[StructField])

    One or more StructFields can be extracted from a StructType by name. If multiple StructFields are extracted, a StructType object is returned, and any provided name that does not have a matching field is ignored. If a single StructField is extracted and the name has no matching field, null is returned. Example:

    import org.apache.spark.sql._
    
    val struct =
      StructType(
        StructField("a", IntegerType, true) ::
        StructField("b", LongType, false) ::
        StructField("c", BooleanType, false) :: Nil)
    
    // Extract a single StructField.
    val singleField = struct("b")
    // singleField: StructField = StructField(b,LongType,false)
    
    // This struct does not have a field called "d". null will be returned.
    val nonExisting = struct("d")
    // nonExisting: StructField = null
    
    // Extract multiple StructFields. Field names are provided in a set.
    // A StructType object will be returned.
    val twoFields = struct(Set("b", "c"))
    // twoFields: StructType =
    //   StructType(List(StructField(b,LongType,false), StructField(c,BooleanType,false)))
    
    // Names that do not have matching fields are ignored.
    // In the case shown below, "d" is ignored and the call
    // is treated as struct(Set("b", "c")).
    val ignoreNonExisting = struct(Set("b", "c", "d"))
    // ignoreNonExisting: StructType =
    //   StructType(List(StructField(b,LongType,false), StructField(c,BooleanType,false)))

    A Row object is used as the value of a StructType. Example:

    import org.apache.spark.sql._
    
    val innerStruct =
      StructType(
        StructField("f1", IntegerType, true) ::
        StructField("f2", LongType, false) ::
        StructField("f3", BooleanType, false) :: Nil)
    
    val struct = StructType(
      StructField("a", innerStruct, true) :: Nil)
    
    // Create a Row with the schema defined by struct
    val row = Row(Row(1, 2, true))
    // row: Row = [[1,2,true]]
  19. val TimestampType: sql.catalyst.types.TimestampType.type

    :: DeveloperApi ::

    The data type representing java.sql.Timestamp values.

  20. package api

  21. package execution

    :: DeveloperApi :: An execution engine for relational query plans that runs on top of Spark and returns RDDs.

  22. package hive

  23. package parquet

  24. package sources

    A set of APIs for adding data sources to Spark SQL.

  25. package test

  26. package types
