Package org.apache.spark.input

Class PortableDataStream

java.lang.Object
  org.apache.spark.input.PortableDataStream

All Implemented Interfaces:
Serializable, scala.Serializable
A class that allows DataStreams to be serialized and moved around by not creating them until they need to be read.

Note:
TaskAttemptContext is not serializable, resulting in the confBytes construct; CombineFileSplit is not serializable, resulting in the splitBytes construct.
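The deferred-creation idea behind the class can be illustrated with a small stdlib-only sketch (hypothetical names, not Spark's implementation): only serializable state is carried around, and the underlying stream is opened lazily, on first read.

```java
import java.io.*;
import java.nio.file.*;

// Hypothetical illustration of the PortableDataStream idea: hold only
// serializable state (here, a path string stands in for splitBytes/confBytes)
// and defer opening the underlying stream until the data is actually read.
class LazyFileStream implements Serializable {
    private final String path; // serializable; no open stream is held

    LazyFileStream(String path) { this.path = path; }

    // The stream is created on demand, not at construction time.
    DataInputStream open() throws IOException {
        return new DataInputStream(Files.newInputStream(Paths.get(path)));
    }

    // Read the whole file by opening and fully draining the stream.
    byte[] toArray() throws IOException {
        try (DataInputStream in = open()) {
            return in.readAllBytes();
        }
    }
}

public class LazyStreamDemo {
    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("demo", ".bin");
        Files.write(tmp, new byte[]{1, 2, 3});
        LazyFileStream lazy = new LazyFileStream(tmp.toString());
        System.out.println(lazy.toArray().length); // prints 3
        Files.delete(tmp);
    }
}
```

Because `LazyFileStream` contains nothing but a string, it can be serialized and shipped freely; the actual I/O object only exists on the machine that calls `open()`, which is the same reason `PortableDataStream` stores configuration and split information as byte arrays.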
Constructor Summary

Constructor / Description
PortableDataStream(org.apache.hadoop.mapreduce.lib.input.CombineFileSplit isplit, org.apache.hadoop.mapreduce.TaskAttemptContext context, Integer index)
Method Summary

org.apache.hadoop.conf.Configuration getConfiguration()
String getPath()
java.io.DataInputStream open() - Create a new DataInputStream from the split and context.
byte[] toArray() - Read the file as a byte array.
Constructor Details

PortableDataStream
public PortableDataStream(org.apache.hadoop.mapreduce.lib.input.CombineFileSplit isplit, org.apache.hadoop.mapreduce.TaskAttemptContext context, Integer index)
Method Details

getConfiguration
public org.apache.hadoop.conf.Configuration getConfiguration()

getPath
public String getPath()
open
public java.io.DataInputStream open()
Create a new DataInputStream from the split and context. The user of this method is responsible for closing the stream after usage.
Returns:
a new DataInputStream for reading the file contents
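Since open() hands ownership of the stream to the caller, the natural way to honor the close-after-usage contract is try-with-resources. A minimal stdlib sketch (a plain DataInputStream over an in-memory buffer, not a Spark call):

```java
import java.io.*;

public class OpenCloseDemo {
    public static void main(String[] args) throws IOException {
        byte[] payload = {42, 0, 7};
        // try-with-resources guarantees the stream is closed when the block
        // exits, which is exactly the caller's obligation after open().
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload))) {
            System.out.println(in.readByte()); // prints 42
        }
    }
}
```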
toArray
public byte[] toArray()
Read the file as a byte array.
Returns:
the contents of the file as a byte array
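Conceptually, toArray() is the convenience form of opening the stream and draining it fully; InputStream.readAllBytes (Java 9+) expresses the same operation in one call. A stdlib-only sketch of that equivalence (in-memory data instead of a real split):

```java
import java.io.*;

public class ToArrayDemo {
    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3, 4};
        // Open, drain completely into a byte array, and close:
        // the same shape as toArray() relative to open().
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            byte[] copy = in.readAllBytes();
            System.out.println(copy.length); // prints 4
        }
    }
}
```

Prefer open() over toArray() when the file may be large, since toArray() materializes the entire contents in memory at once.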