@Evolving
public interface PartitionReaderFactory
extends java.io.Serializable
A factory used to create PartitionReader instances.

If Spark fails to execute any methods in the implementations of this interface or in the returned PartitionReader (by throwing an exception), the corresponding Spark task will fail and be retried until the maximum number of retries is reached.
Modifier and Type | Method and Description |
---|---|
default PartitionReader<ColumnarBatch> | createColumnarReader(InputPartition partition) Returns a columnar partition reader to read data from the given InputPartition. |
PartitionReader<org.apache.spark.sql.catalyst.InternalRow> | createReader(InputPartition partition) Returns a row-based partition reader to read data from the given InputPartition. |
default boolean | supportColumnarReads(InputPartition partition) Returns true if the given InputPartition should be read by Spark in a columnar way. |
PartitionReader<org.apache.spark.sql.catalyst.InternalRow> createReader(InputPartition partition)

Returns a row-based partition reader to read data from the given InputPartition.

Implementations probably need to cast the input partition to the concrete InputPartition class defined for the data source.
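A minimal sketch of a row-based factory following this contract, assuming the Spark 3.x connector packages (org.apache.spark.sql.connector.read); RangeInputPartition and the anonymous reader are hypothetical names invented for illustration, not part of the API:

```java
import java.io.IOException;
import org.apache.spark.sql.catalyst.InternalRow;
import org.apache.spark.sql.catalyst.expressions.GenericInternalRow;
import org.apache.spark.sql.connector.read.InputPartition;
import org.apache.spark.sql.connector.read.PartitionReader;
import org.apache.spark.sql.connector.read.PartitionReaderFactory;

// Hypothetical concrete InputPartition describing a range of longs to produce.
class RangeInputPartition implements InputPartition {
  final long start, end;
  RangeInputPartition(long start, long end) { this.start = start; this.end = end; }
}

public class RangeReaderFactory implements PartitionReaderFactory {
  @Override
  public PartitionReader<InternalRow> createReader(InputPartition partition) {
    // Cast to the concrete InputPartition class defined for this data source.
    RangeInputPartition p = (RangeInputPartition) partition;
    return new PartitionReader<InternalRow>() {
      private long current = p.start - 1;

      @Override
      public boolean next() throws IOException {
        current += 1;
        return current < p.end;  // advance; signal end of data when exhausted
      }

      @Override
      public InternalRow get() {
        // Single-column row holding the current value.
        return new GenericInternalRow(new Object[] { current });
      }

      @Override
      public void close() throws IOException { }
    };
  }
}
```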
default PartitionReader<ColumnarBatch> createColumnarReader(InputPartition partition)

Returns a columnar partition reader to read data from the given InputPartition.

Implementations probably need to cast the input partition to the concrete InputPartition class defined for the data source.
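For columnar reads, the reader yields ColumnarBatch instead of individual rows. A hedged sketch of such a reader for the hypothetical RangeInputPartition above; it uses Spark's internal OnHeapColumnVector as a convenience, though any ColumnVector implementation would work:

```java
import java.io.IOException;
import org.apache.spark.sql.connector.read.PartitionReader;
import org.apache.spark.sql.execution.vectorized.OnHeapColumnVector;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.vectorized.ColumnarBatch;

// Hypothetical columnar reader that emits the whole range as one batch.
public class RangeColumnarReader implements PartitionReader<ColumnarBatch> {
  private final long start, end;
  private boolean consumed = false;

  public RangeColumnarReader(long start, long end) {
    this.start = start;
    this.end = end;
  }

  @Override
  public boolean next() throws IOException {
    // One batch total: return true once, then signal end of data.
    if (consumed) return false;
    consumed = true;
    return true;
  }

  @Override
  public ColumnarBatch get() {
    int n = (int) (end - start);
    OnHeapColumnVector col = new OnHeapColumnVector(n, DataTypes.LongType);
    for (int i = 0; i < n; i++) {
      col.putLong(i, start + i);  // fill the single long column
    }
    ColumnarBatch batch = new ColumnarBatch(new OnHeapColumnVector[] { col });
    batch.setNumRows(n);
    return batch;
  }

  @Override
  public void close() throws IOException { }
}
```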
default boolean supportColumnarReads(InputPartition partition)

Returns true if the given InputPartition should be read by Spark in a columnar way. This means implementations must also implement createColumnarReader(InputPartition) for the input partitions for which this method returns true.

As of Spark 2.4, Spark can only read all input partitions in a columnar way, or none of them. A data source can't mix columnar and row-based partitions. This may be relaxed in future versions.
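Tying the pieces together, a factory that opts in to columnar reads might look like the following sketch, building on the hypothetical classes above. Throwing from createReader here is one possible pattern, on the assumption that it is never called when supportColumnarReads always returns true:

```java
import org.apache.spark.sql.catalyst.InternalRow;
import org.apache.spark.sql.connector.read.InputPartition;
import org.apache.spark.sql.connector.read.PartitionReader;
import org.apache.spark.sql.connector.read.PartitionReaderFactory;
import org.apache.spark.sql.vectorized.ColumnarBatch;

public class ColumnarRangeReaderFactory implements PartitionReaderFactory {
  @Override
  public PartitionReader<InternalRow> createReader(InputPartition partition) {
    // Not expected to be reached: supportColumnarReads always returns true.
    throw new UnsupportedOperationException("This source only creates columnar readers");
  }

  @Override
  public PartitionReader<ColumnarBatch> createColumnarReader(InputPartition partition) {
    // Cast to the concrete InputPartition class defined for this data source.
    RangeInputPartition p = (RangeInputPartition) partition;
    return new RangeColumnarReader(p.start, p.end);
  }

  @Override
  public boolean supportColumnarReads(InputPartition partition) {
    // As of Spark 2.4 the answer must be the same for every partition of a
    // scan: columnar and row-based partitions cannot be mixed.
    return true;
  }
}
```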