@InterfaceStability.Evolving
public interface DataReaderFactory<T>
extends java.io.Serializable
A reader factory returned by DataSourceReader.createDataReaderFactories(), responsible for creating the actual data reader. The relationship between DataReaderFactory and DataReader is similar to the relationship between Iterable and Iterator.
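The Iterable/Iterator analogy can be sketched without a Spark dependency. In the sketch below, SketchDataReader and SketchDataReaderFactory are simplified stand-ins written for this example (they are not the Spark interfaces), and RangeReaderFactory and RangeReaderDemo are hypothetical names. It assumes a cursor-style reader with next() and get(); the point is only that each call to createDataReader() yields a fresh, independent reader, much as each call to Iterable.iterator() yields a fresh Iterator.

```java
import java.io.Closeable;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Stand-in for DataReader<T> (assumption: a cursor with next()/get() that is closed after use).
interface SketchDataReader<T> extends Closeable {
    boolean next();   // advance to the next record; false when exhausted
    T get();          // the current record; valid only after next() returned true
    @Override
    void close();     // narrowed so the sketch needs no checked-exception handling
}

// Stand-in for DataReaderFactory<T>: plays the role of Iterable.
interface SketchDataReaderFactory<T> extends Serializable {
    SketchDataReader<T> createDataReader();
}

// Hypothetical factory describing the integer range [start, end).
final class RangeReaderFactory implements SketchDataReaderFactory<Integer> {
    private static final long serialVersionUID = 1L;
    private final int start;
    private final int end;

    RangeReaderFactory(int start, int end) {
        this.start = start;
        this.end = end;
    }

    @Override
    public SketchDataReader<Integer> createDataReader() {
        // Every call returns a new reader with its own cursor position.
        return new SketchDataReader<Integer>() {
            private int current = start - 1;

            @Override
            public boolean next() {
                current++;
                return current < end;
            }

            @Override
            public Integer get() {
                return current;
            }

            @Override
            public void close() {
                // Nothing to release in this in-memory sketch.
            }
        };
    }
}

final class RangeReaderDemo {
    // Drains one reader obtained from the factory, like a for-each over an Iterable.
    static List<Integer> readAll(SketchDataReaderFactory<Integer> factory) {
        List<Integer> out = new ArrayList<>();
        try (SketchDataReader<Integer> reader = factory.createDataReader()) {
            while (reader.next()) {
                out.add(reader.get());
            }
        }
        return out;
    }
}
```

Calling RangeReaderDemo.readAll twice on the same factory returns the same records both times, because the factory holds only the description of the data and each reader holds its own position.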
Note that the reader factory is serialized and sent to executors, where the data reader is then created and does the actual reading. DataReaderFactory must therefore be serializable, while DataReader does not need to be.

Modifier and Type | Method and Description |
---|---|
DataReader<T> | createDataReader() Returns a data reader to do the actual reading work. |
default String[] | preferredLocations() The preferred locations where the data reader returned by this reader factory can run faster, but Spark does not guarantee to run the data reader on these locations. |
default String[] preferredLocations()
DataReader<T> createDataReader()
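The serialization requirement and the preferredLocations() hint can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: SketchReader and SketchFactory are stand-ins rather than the Spark interfaces, HostPinnedFactory, RowReader and ShippingDemo are hypothetical names, the host name is made up, the empty-array default is simply what this sketch chooses, and a plain Java serialization round trip stands in for however Spark actually ships the factory to executors.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Stand-in for DataReader<T>; note that it is NOT Serializable.
interface SketchReader<T> extends Closeable {
    boolean next();
    T get();
    @Override
    void close();
}

// Stand-in for DataReaderFactory<T>; Serializable so it can be shipped.
interface SketchFactory<T> extends Serializable {
    // Assumed default for this sketch: no location preference.
    default String[] preferredLocations() {
        return new String[0];
    }

    SketchReader<T> createDataReader();
}

// Hypothetical factory: carries only serializable state (a host hint and the rows).
final class HostPinnedFactory implements SketchFactory<String> {
    private static final long serialVersionUID = 1L;
    private final String host;
    private final String[] rows;

    HostPinnedFactory(String host, String[] rows) {
        this.host = host;
        this.rows = rows.clone();
    }

    @Override
    public String[] preferredLocations() {
        // A hint only: the reader must still work if it runs elsewhere.
        return new String[] { host };
    }

    @Override
    public SketchReader<String> createDataReader() {
        // Created after deserialization, so the reader itself never crosses the wire.
        return new RowReader(rows);
    }
}

// Deliberately not Serializable: holds a mutable cursor.
final class RowReader implements SketchReader<String> {
    private final String[] rows;
    private int index = -1;

    RowReader(String[] rows) {
        this.rows = rows;
    }

    @Override
    public boolean next() {
        index++;
        return index < rows.length;
    }

    @Override
    public String get() {
        return rows[index];
    }

    @Override
    public void close() {
        // Nothing to release in this in-memory sketch.
    }
}

final class ShippingDemo {
    // Serializes and deserializes a factory, imitating the hop from driver to executor.
    static <F extends Serializable> F roundTrip(F factory) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(factory);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                F copy = (F) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("factory could not be shipped", e);
        }
    }
}
```

In this sketch the factory holds only plain, serializable fields, and anything stateful (the cursor here; in practice perhaps a connection or file handle) is created inside createDataReader() on the receiving side. Keeping such state out of the factory is what lets the reader remain non-serializable.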