org.apache.spark.sql.sources

Class OutputWriterFactory

Object
  org.apache.spark.sql.sources.OutputWriterFactory

- All Implemented Interfaces:
  - java.io.Serializable

public abstract class OutputWriterFactory
extends Object
implements scala.Serializable
::Experimental::
A factory that produces OutputWriters. A new OutputWriterFactory is created on the driver side for each write job issued when writing to a HadoopFsRelation, and is then serialized to the executor side to create the actual OutputWriters on the fly.
- Since:
- 1.4.0
- See Also:
- Serialized Form
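The lifecycle described above (a factory built once on the driver, serialized to executors, which then create one writer per task) can be sketched in plain Java. `SimpleWriter`, `SimpleWriterFactory`, `CsvWriter`, and `FactoryDemo` below are simplified stand-ins invented for illustration, not the Spark API; the serialization round trip simulates the driver-to-executor hop:

```java
import java.io.*;
import java.util.*;

// Stand-in for OutputWriter: one instance per task, writing rows to one file.
abstract class SimpleWriter {
    abstract void write(List<Object> row);
    abstract void close();
}

// Stand-in for OutputWriterFactory: Serializable, because the driver builds
// it once per write job and ships it to executors, which call newInstance.
abstract class SimpleWriterFactory implements Serializable {
    abstract SimpleWriter newInstance(String path);
}

// A concrete writer that buffers CSV lines; a real one would flush to `path`.
class CsvWriter extends SimpleWriter {
    final String path;
    final List<String> lines = new ArrayList<>();
    CsvWriter(String path) { this.path = path; }
    void write(List<Object> row) {
        StringJoiner j = new StringJoiner(",");
        for (Object v : row) j.add(String.valueOf(v));
        lines.add(j.toString());
    }
    void close() { /* a real writer would flush `lines` to `path` here */ }
}

class CsvWriterFactory extends SimpleWriterFactory {
    SimpleWriter newInstance(String path) { return new CsvWriter(path); }
}

public class FactoryDemo {
    // Simulate the driver -> executor hop with a serialization round trip.
    static SimpleWriterFactory roundTrip(SimpleWriterFactory f) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            new ObjectOutputStream(bos).writeObject(f);
            ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()));
            return (SimpleWriterFactory) in.readObject();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // "Executor side": the deserialized factory creates writers on the fly.
        SimpleWriterFactory onExecutor = roundTrip(new CsvWriterFactory());
        CsvWriter w = (CsvWriter) onExecutor.newInstance("/tmp/part-00000");
        w.write(List.of(1, "a"));
        w.close();
        System.out.println(w.lines); // [1,a]
    }
}
```

The key point the sketch illustrates is why the factory, rather than the writer, is what crosses the wire: writers typically hold non-serializable state (open file handles), so each task constructs its own writer locally from the serialized factory.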
Methods inherited from class Object:
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
OutputWriterFactory
public OutputWriterFactory()
newInstance
public abstract OutputWriter newInstance(String path,
StructType dataSchema,
org.apache.hadoop.mapreduce.TaskAttemptContext context)
- When writing to a HadoopFsRelation, this method gets called on the executor side by each task to instantiate new OutputWriters.
- Parameters:
  - path - Path of the file to which this OutputWriter is supposed to write. Note that this may not point to the final output file. For example, FileOutputFormat writes to temporary directories and then merges the written files into the final destination. In this case, path points to a temporary output file under the temporary directory.
  - dataSchema - Schema of the rows to be written. Partition columns are not included in the schema if the relation being written is partitioned.
  - context - The Hadoop MapReduce task context.
- Returns:
- (undocumented)
- Since:
- 1.4.0
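The note on the path parameter (the writer may be handed a temporary file that is only later moved to the final destination) can be sketched in miniature with plain file operations. `CommitDemo` and `writeAndCommit` below are illustrative stand-ins, not Spark or Hadoop API; they mimic the write-to-temporary-then-commit pattern that FileOutputFormat uses:

```java
import java.nio.file.*;
import java.util.List;

public class CommitDemo {
    // Write under a temporary directory, then "commit" by moving the file to
    // its final destination -- the reason the `path` handed to newInstance
    // may not point at the final output file.
    static Path writeAndCommit(Path outputDir, String fileName, List<String> lines) {
        try {
            Path tempDir = outputDir.resolve("_temporary");
            Files.createDirectories(tempDir);
            Path tempFile = tempDir.resolve(fileName);    // what the writer sees
            Files.write(tempFile, lines);
            Path finalFile = outputDir.resolve(fileName); // where data ends up
            Files.move(tempFile, finalFile, StandardCopyOption.REPLACE_EXISTING);
            return finalFile;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("commit-demo");
        Path out = writeAndCommit(dir, "part-00000", List.of("1,a", "2,b"));
        System.out.println(Files.readAllLines(out)); // [1,a, 2,b]
    }
}
```

An OutputWriter should therefore treat path as opaque: write to exactly the path it is given and let the output committer decide where the data finally lands.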