Package org.apache.spark.sql.hive.execution

package execution

Type Members

  1. case class CreateHiveTableAsSelectCommand(tableDesc: CatalogTable, query: LogicalPlan, mode: SaveMode) extends LeafNode with RunnableCommand with Product with Serializable

    Create table and insert the query result into it.

    tableDesc

    the table description, which may contain serde, storage handler, etc.

    query

    the query whose result will be inserted into the new relation

    mode

    SaveMode
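
    As a hedged illustration (the session setup and the table names hive_target and src are assumptions, not part of the API above), a CTAS statement issued against a Hive-enabled SparkSession is what typically plans this command:

      // A minimal sketch; `hive_target` and `src` are illustrative names.
      import org.apache.spark.sql.SparkSession

      val spark = SparkSession.builder()
        .appName("ctas-example")   // illustrative app name
        .enableHiveSupport()       // required for Hive table commands
        .getOrCreate()

      // The parsed SELECT becomes `query`, the target table's metadata
      // becomes `tableDesc`, and the write behavior is driven by `mode`.
      spark.sql("CREATE TABLE hive_target STORED AS PARQUET AS SELECT id, name FROM src")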

  2. class HiveFileFormat extends FileFormat with DataSourceRegister with Logging

    FileFormat for writing Hive tables.

    TODO: implement the read logic.

  3. class HiveOptions extends Serializable

    Options for the Hive data source. Note that rule DetermineHiveSerde will extract Hive serde/format information from these options.
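
    These options are typically supplied through the OPTIONS clause when a table is created with the Hive provider. A minimal sketch, assuming a Hive-enabled SparkSession named spark and an illustrative table name:

      // The key/value pairs in OPTIONS are parsed by HiveOptions;
      // DetermineHiveSerde then derives the serde/format from them.
      spark.sql("""
        CREATE TABLE hive_opts_example (id INT, name STRING)
        USING hive
        OPTIONS (fileFormat 'parquet')
      """)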

  4. class HiveOutputWriter extends OutputWriter with HiveInspectors

  5. case class HiveScriptIOSchema(inputRowFormat: Seq[(String, String)], outputRowFormat: Seq[(String, String)], inputSerdeClass: Option[String], outputSerdeClass: Option[String], inputSerdeProps: Seq[(String, String)], outputSerdeProps: Seq[(String, String)], recordReaderClass: Option[String], recordWriterClass: Option[String], schemaLess: Boolean) extends HiveInspectors with Product with Serializable

    The wrapper class of Hive input and output schema properties
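
    As a hedged sketch of where these properties come from (assuming a Hive-enabled SparkSession spark; the table src and the script 'cat' are illustrative), the optional ROW FORMAT clauses of a script transformation populate the input/output row-format fields of this class:

      // The first ROW FORMAT describes rows fed to the script
      // (inputRowFormat); the second describes rows read back from it
      // (outputRowFormat).
      spark.sql("""
        SELECT TRANSFORM (key, value)
          ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
          USING 'cat'
          AS (k, v)
          ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
        FROM src
      """)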

  6. case class InsertIntoHiveTable(table: CatalogTable, partition: Map[String, Option[String]], query: LogicalPlan, overwrite: Boolean, ifPartitionNotExists: Boolean) extends LeafNode with RunnableCommand with Product with Serializable

    Command for writing data out to a Hive table.

    This class is mostly a mess, for legacy reasons (since it evolved in organic ways and had to follow Hive's internal implementations closely, which itself was a mess too). Please don't blame Reynold for this! He was just moving code around!

    In the future we should converge the write path for Hive with the normal data source write path, as defined in org.apache.spark.sql.execution.datasources.FileFormatWriter.

    table

    the metadata of the table.

    partition

    a map from the partition key to the partition value (optional). If the partition value is optional, dynamic partition insert will be performed. As an example, INSERT INTO tbl PARTITION (a=1, b=2) AS ... would have

    Map('a' -> Some('1'), 'b' -> Some('2'))

    and INSERT INTO tbl PARTITION (a=1, b) AS ... would have

    Map('a' -> Some('1'), 'b' -> None).

    query

    the logical plan representing the data to write.

    overwrite

    overwrite existing table or partitions.

    ifPartitionNotExists

    If true, only write if the partition does not exist. Only valid for static partitions.
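
    As a hedged illustration of the two partition modes described above (assuming a Hive-enabled SparkSession spark, a table tbl partitioned by (a, b), and a source table src; all names illustrative):

      // Static partition insert:
      //   partition = Map("a" -> Some("1"), "b" -> Some("2"))
      spark.sql("INSERT INTO tbl PARTITION (a=1, b=2) SELECT key, value FROM src")

      // Dynamic partition insert for b:
      //   partition = Map("a" -> Some("1"), "b" -> None)
      // The value of b is taken from the last column of the SELECT.
      spark.sql("INSERT INTO tbl PARTITION (a=1, b) SELECT key, value, b FROM src")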

  7. case class ScriptTransformationExec(input: Seq[Expression], script: String, output: Seq[Attribute], child: SparkPlan, ioschema: HiveScriptIOSchema) extends SparkPlan with UnaryExecNode with Product with Serializable

    Transforms the input by forking and running the specified script.

    input

    the set of expressions that should be passed to the script.

    script

    the command that should be executed.

    output

    the attributes that are produced by the script.
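
    A minimal sketch of the SQL that plans this operator (assuming a Hive-enabled SparkSession spark; the table src and the script 'cat' are illustrative):

      // Each input row is serialized, piped through the forked 'cat'
      // process, and the process's stdout is parsed back into the
      // declared output attributes.
      spark.sql("""
        SELECT TRANSFORM (key, value)
        USING 'cat'
        AS (transformedKey, transformedValue)
        FROM src
      """)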

Value Members

  1. object HiveOptions extends Serializable

  2. object HiveScriptIOSchema extends Serializable
