@Experimental public interface SupportsRuntimeFiltering extends Scan
Scan
. Data sources can implement this interface if they can
filter initially planned InputPartition
s using predicates Spark infers at runtime.
Note that Spark will push runtime filters only if they are beneficial.
Modifier and Type | Method and Description |
---|---|
void |
filter(Filter[] filters)
Filters this scan using runtime filters.
|
NamedReference[] |
filterAttributes()
Returns attributes this scan can be filtered by at runtime.
|
description, readSchema, supportedCustomMetrics, toBatch, toContinuousStream, toMicroBatchStream
NamedReference[] filterAttributes()
Spark will call filter(Filter[])
if it can derive a runtime
predicate for any of the filter attributes.
void filter(Filter[] filters)
The provided expressions must be interpreted as a set of filters that are ANDed together.
Implementations may use the filters to prune initially planned InputPartition
s.
If the scan also implements SupportsReportPartitioning
, it must preserve
the originally reported partitioning during runtime filtering. While applying runtime filters,
the scan may detect that some InputPartition
s have no matching data. It can omit
such partitions entirely only if it does not report a specific partitioning. Otherwise,
the scan can replace the initially planned InputPartition
s that have no matching
data with empty InputPartition
s but must preserve the overall number of partitions.
Note that Spark will call Scan.toBatch()
again after filtering the scan at runtime.
filters
- data source filters used to filter the scan at runtime