org.apache.spark.sql.parquet

InsertIntoParquetTable

case class InsertIntoParquetTable(relation: ParquetRelation, child: SparkPlan, overwrite: Boolean = false) extends SparkPlan with UnaryNode with SparkHadoopMapReduceUtil with Product with Serializable

:: DeveloperApi :: Operator that acts as a sink for queries on RDDs and stores their output in a directory of Parquet files. It is similar to Hive's INSERT INTO TABLE operation in that the caller chooses whether to overwrite the directory or append to it. Consecutive insertions into the same table must have compatible (source) schemas.

WARNING: EXPERIMENTAL! InsertIntoParquetTable with overwrite=false may cause data corruption if multiple users append to the same table simultaneously. Inserting into a table that was previously generated by other means (e.g., by creating an HDFS directory and importing Parquet files generated by other tools) may cause unpredictable behaviour and therefore results in a RuntimeException. Such tables are detected only via a filename pattern, so not all cases will be caught.
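The overwrite and append modes described above can be exercised roughly as follows. This is a minimal sketch, not a definitive recipe: it assumes the Spark 1.1/1.2 SchemaRDD API (`createParquetFile`, `registerTempTable`, `insertInto`) and assumes that the query planner turns an insert into a Parquet-backed table into this operator. The `Record` case class, the table name `dest`, and the temporary directory are hypothetical names chosen for illustration.

```scala
import java.nio.file.Files

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical row type used only for this illustration.
case class Record(key: Int, value: String)

object InsertIntoParquetSketch {

  /** Overwrites and then appends to a Parquet-backed table, returning the
    * row counts observed after each step. */
  def overwriteThenAppend(sqlContext: SQLContext, path: String): (Long, Long) = {
    // Implicit conversion from an RDD of case classes to a SchemaRDD
    // (assumed Spark 1.1/1.2 API).
    import sqlContext.createSchemaRDD

    val source = sqlContext.sparkContext
      .parallelize(1 to 100)
      .map(i => Record(i, s"val_$i"))

    // Create an empty Parquet-backed table at `path` and register it by name.
    val dest = sqlContext.createParquetFile[Record](path)
    dest.registerTempTable("dest")

    // overwrite = true: replace whatever the directory currently holds.
    source.insertInto("dest", overwrite = true)
    val afterOverwrite = sqlContext.sql("SELECT * FROM dest").count()

    // overwrite = false (the default): append to the existing files.
    source.insertInto("dest")
    val afterAppend = sqlContext.sql("SELECT * FROM dest").count()

    (afterOverwrite, afterAppend)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("InsertIntoParquetSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical scratch location; any writable directory would do.
      val dir = Files.createTempDirectory("parquet-dest").toString
      val (afterOverwrite, afterAppend) = overwriteThenAppend(new SQLContext(sc), dir)
      println(s"rows after overwrite: $afterOverwrite, rows after append: $afterAppend")
    } finally {
      sc.stop()
    }
  }
}
```

Both inserts here run sequentially from a single driver. Per the warning above, concurrent appends from several users are not safe, so a caller would need to serialize them by some external means.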

Annotations
@DeveloperApi()
Linear Supertypes
Product, Equals, SparkHadoopMapReduceUtil, UnaryNode, UnaryNode[SparkPlan], SparkPlan, Serializable, Serializable, Logging, QueryPlan[SparkPlan], TreeNode[SparkPlan], AnyRef, Any

Instance Constructors

  1. new InsertIntoParquetTable(relation: ParquetRelation, child: SparkPlan, overwrite: Boolean = false)

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. def apply(number: Int): SparkPlan

    Definition Classes
    TreeNode
  7. def argString: String

    Definition Classes
    TreeNode
  8. def asCode: String

    Definition Classes
    TreeNode
  9. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  10. val child: SparkPlan

    Definition Classes
    InsertIntoParquetTable → UnaryNode
  11. def children: List[SparkPlan]

    Definition Classes
    UnaryNode
  12. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  13. val codegenEnabled: Boolean

    Definition Classes
    SparkPlan
  14. def collect[B](pf: PartialFunction[SparkPlan, B]): Seq[B]

    Definition Classes
    TreeNode
  15. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  16. def execute(): RDD[catalyst.expressions.Row]

    Inserts all rows into the Parquet file.

    Definition Classes
    InsertIntoParquetTable → SparkPlan
  17. def executeCollect(): Array[catalyst.expressions.Row]

    Runs this query returning the result as an array.

    Definition Classes
    SparkPlan
  18. def expressions: Seq[Expression]

    Definition Classes
    QueryPlan
  19. def fastEquals(other: TreeNode[_]): Boolean

    Definition Classes
    TreeNode
  20. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  21. def flatMap[A](f: (SparkPlan) ⇒ TraversableOnce[A]): Seq[A]

    Definition Classes
    TreeNode
  22. def foreach(f: (SparkPlan) ⇒ Unit): Unit

    Definition Classes
    TreeNode
  23. def generateTreeString(depth: Int, builder: StringBuilder): StringBuilder

    Attributes
    protected
    Definition Classes
    TreeNode
  24. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  25. def getNodeNumbered(number: MutableInt): SparkPlan

    Attributes
    protected
    Definition Classes
    TreeNode
  26. def inputSet: AttributeSet

    Definition Classes
    QueryPlan
  27. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  28. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  29. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  30. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  31. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  32. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  33. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  34. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  35. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  36. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  37. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  38. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  39. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  40. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  41. def makeCopy(newArgs: Array[AnyRef]): InsertIntoParquetTable.this.type

    Overridden makeCopy that also propagates sqlContext to the copied plan.

    Definition Classes
    SparkPlan → TreeNode
  42. def map[A](f: (SparkPlan) ⇒ A): Seq[A]

    Definition Classes
    TreeNode
  43. def mapChildren(f: (SparkPlan) ⇒ SparkPlan): InsertIntoParquetTable.this.type

    Definition Classes
    TreeNode
  44. def missingInput: AttributeSet

    Definition Classes
    QueryPlan
  45. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  46. def newJobContext(conf: Configuration, jobId: JobID): JobContext

    Definition Classes
    SparkHadoopMapReduceUtil
  47. def newMutableProjection(expressions: Seq[Expression], inputSchema: Seq[Attribute]): () ⇒ MutableProjection

    Attributes
    protected
    Definition Classes
    SparkPlan
  48. def newOrdering(order: Seq[SortOrder], inputSchema: Seq[Attribute]): Ordering[catalyst.expressions.Row]

    Attributes
    protected
    Definition Classes
    SparkPlan
  49. def newPredicate(expression: Expression, inputSchema: Seq[Attribute]): (catalyst.expressions.Row) ⇒ Boolean

    Attributes
    protected
    Definition Classes
    SparkPlan
  50. def newProjection(expressions: Seq[Expression], inputSchema: Seq[Attribute]): Projection

    Attributes
    protected
    Definition Classes
    SparkPlan
  51. def newTaskAttemptContext(conf: Configuration, attemptId: TaskAttemptID): TaskAttemptContext

    Definition Classes
    SparkHadoopMapReduceUtil
  52. def newTaskAttemptID(jtIdentifier: String, jobId: Int, isMap: Boolean, taskId: Int, attemptId: Int): TaskAttemptID

    Definition Classes
    SparkHadoopMapReduceUtil
  53. def nodeName: String

    Definition Classes
    TreeNode
  54. final def notify(): Unit

    Definition Classes
    AnyRef
  55. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  56. def numberedTreeString: String

    Definition Classes
    TreeNode
  57. def otherCopyArgs: Seq[AnyRef]

    Attributes
    protected
    Definition Classes
    TreeNode
  58. def output: Seq[Attribute]

    Definition Classes
    InsertIntoParquetTable → QueryPlan
  59. def outputPartitioning: Partitioning

    Specifies how data is partitioned across different nodes in the cluster.

    Definition Classes
    UnaryNode → SparkPlan
  60. def outputSet: AttributeSet

    Definition Classes
    QueryPlan
  61. val overwrite: Boolean

  62. def printSchema(): Unit

    Definition Classes
    QueryPlan
  63. def references: AttributeSet

    Definition Classes
    QueryPlan
  64. val relation: ParquetRelation

  65. def requiredChildDistribution: Seq[Distribution]

    Specifies any partition requirements on the input data for this operator.

    Definition Classes
    SparkPlan
  66. def schema: catalyst.types.StructType

    Definition Classes
    QueryPlan
  67. def schemaString: String

    Definition Classes
    QueryPlan
  68. def simpleString: String

    Definition Classes
    QueryPlan → TreeNode
  69. def sparkContext: SparkContext

    Attributes
    protected
    Definition Classes
    SparkPlan
  70. val sqlContext: SQLContext

    A handle to the SQL Context that was used to create this plan. Since many operators need access to the sqlContext for RDD operations or configuration, this field is automatically populated by the query planning infrastructure.

    Attributes
    protected[org.apache.spark]
    Definition Classes
    SparkPlan
  71. def statePrefix: String

    Attributes
    protected
    Definition Classes
    QueryPlan
  72. def stringArgs: Iterator[Any]

    Attributes
    protected
    Definition Classes
    TreeNode
  73. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  74. def toString(): String

    Definition Classes
    TreeNode → AnyRef → Any
  75. def transform(rule: PartialFunction[SparkPlan, SparkPlan]): SparkPlan

    Definition Classes
    TreeNode
  76. def transformAllExpressions(rule: PartialFunction[Expression, Expression]): InsertIntoParquetTable.this.type

    Definition Classes
    QueryPlan
  77. def transformChildrenDown(rule: PartialFunction[SparkPlan, SparkPlan]): InsertIntoParquetTable.this.type

    Definition Classes
    TreeNode
  78. def transformChildrenUp(rule: PartialFunction[SparkPlan, SparkPlan]): InsertIntoParquetTable.this.type

    Definition Classes
    TreeNode
  79. def transformDown(rule: PartialFunction[SparkPlan, SparkPlan]): SparkPlan

    Definition Classes
    TreeNode
  80. def transformExpressions(rule: PartialFunction[Expression, Expression]): InsertIntoParquetTable.this.type

    Definition Classes
    QueryPlan
  81. def transformExpressionsDown(rule: PartialFunction[Expression, Expression]): InsertIntoParquetTable.this.type

    Definition Classes
    QueryPlan
  82. def transformExpressionsUp(rule: PartialFunction[Expression, Expression]): InsertIntoParquetTable.this.type

    Definition Classes
    QueryPlan
  83. def transformUp(rule: PartialFunction[SparkPlan, SparkPlan]): SparkPlan

    Definition Classes
    TreeNode
  84. def treeString: String

    Definition Classes
    TreeNode
  85. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  86. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  87. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  88. def withNewChildren(newChildren: Seq[SparkPlan]): InsertIntoParquetTable.this.type

    Definition Classes
    TreeNode
