org.apache.spark.sql.sources
Class OutputWriterFactory

Object
  extended by org.apache.spark.sql.sources.OutputWriterFactory
All Implemented Interfaces:
java.io.Serializable

public abstract class OutputWriterFactory
extends Object
implements scala.Serializable

::Experimental:: A factory that produces OutputWriters. A new OutputWriterFactory is created on the driver side for each write job issued when writing to a HadoopFsRelation, and is then serialized to the executor side to create actual OutputWriters on the fly.

Since:
1.4.0
See Also:
Serialized Form

Constructor Summary
OutputWriterFactory()
           
 
Method Summary
abstract  OutputWriter newInstance(String path, StructType dataSchema, org.apache.hadoop.mapreduce.TaskAttemptContext context)
          When writing to a HadoopFsRelation, this method gets called by each task on the executor side to instantiate new OutputWriters.
 
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

OutputWriterFactory

public OutputWriterFactory()
Method Detail

newInstance

public abstract OutputWriter newInstance(String path,
                                         StructType dataSchema,
                                         org.apache.hadoop.mapreduce.TaskAttemptContext context)
When writing to a HadoopFsRelation, this method gets called by each task on the executor side to instantiate new OutputWriters.

Parameters:
path - Path of the file to which this OutputWriter is supposed to write. Note that this may not point to the final output file. For example, FileOutputFormat writes to temporary directories and then merges the written files back to the final destination. In this case, path points to a temporary output file under the temporary directory.
dataSchema - Schema of the rows to be written. Partition columns are not included in the schema if the relation being written is partitioned.
context - The Hadoop MapReduce task context.

Returns:
A newly instantiated OutputWriter that writes rows to path.
Since:
1.4.0
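
The driver-side creation and executor-side instantiation described above can be sketched with a minimal custom factory. This is a hypothetical CSV-flavored example, not part of the Spark API: the CsvOutputWriter class, its use of a plain java.io.PrintWriter, and the comma-joining of row values are all illustrative assumptions; a real OutputWriter would write through the Hadoop FileSystem API using the supplied TaskAttemptContext.

```scala
import java.io.PrintWriter
import org.apache.hadoop.mapreduce.TaskAttemptContext
import org.apache.spark.sql.Row
import org.apache.spark.sql.sources.{OutputWriter, OutputWriterFactory}
import org.apache.spark.sql.types.StructType

// Hypothetical writer: renders each Row as one comma-separated line.
// PrintWriter is used only to keep the sketch short; production code
// would open the file via the Hadoop FileSystem bound to `context`.
class CsvOutputWriter(path: String, dataSchema: StructType) extends OutputWriter {
  private val out = new PrintWriter(path)

  override def write(row: Row): Unit =
    out.println(row.toSeq.mkString(","))

  override def close(): Unit = out.close()
}

// Created once on the driver per write job, then serialized to executors,
// so the factory must carry only serializable state.
class CsvOutputWriterFactory extends OutputWriterFactory {
  override def newInstance(
      path: String,
      dataSchema: StructType,
      context: TaskAttemptContext): OutputWriter =
    new CsvOutputWriter(path, dataSchema)
}
```

A HadoopFsRelation implementation would hand such a factory to Spark (in 1.4.x, from its prepareJobForWrite method); each write task then calls newInstance with its own temporary output path and the partition-stripped dataSchema.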