All Superinterfaces:: Scan

All Known Subinterfaces:: SupportsRuntimeFiltering

@Experimental public interface SupportsRuntimeV2Filtering extends Scan

A mix-in interface for Scan. Data sources can implement this interface if they can filter initially planned InputPartitions using predicates Spark infers at runtime. This interface is very similar to SupportsRuntimeFiltering except it uses data source V2 Predicate instead of data source V1 Filter. SupportsRuntimeV2Filtering is preferred over SupportsRuntimeFiltering and only one of them should be implemented by the data sources.

Note that Spark will push runtime filters only if they are beneficial.

Since:: 3.4.0

Nested Class Summary

Nested classes/interfaces inherited from interface org.apache.spark.sql.connector.read.Scan
Scan.ColumnarSupportMode
Method Summary

Modifier and Type

Method

Description

void

filter(Predicate[] predicates)

Filters this scan using runtime predicates.

NamedReference[]

filterAttributes()

Returns attributes this scan can be filtered by at runtime.

Methods inherited from interface org.apache.spark.sql.connector.read.Scan
columnarSupportMode, description, readSchema, reportDriverMetrics, supportedCustomMetrics, toBatch, toContinuousStream, toMicroBatchStream

Method Details
- filterAttributes
  
  NamedReference[] filterAttributes()
  
  Returns attributes this scan can be filtered by at runtime.
  Spark will call filter(Predicate[]) if it can derive a runtime predicate for any of the filter attributes.
- filter
  
  void filter(Predicate[] predicates)
  
  Filters this scan using runtime predicates.
  The provided expressions must be interpreted as a set of predicates that are ANDed together. Implementations may use the predicates to prune initially planned InputPartitions.
  If the scan also implements SupportsReportPartitioning, it must preserve the originally reported partitioning during runtime filtering. While applying runtime predicates, the scan may detect that some InputPartitions have no matching data. It can omit such partitions entirely only if it does not report a specific partitioning. Otherwise, the scan can replace the initially planned InputPartitions that have no matching data with empty InputPartitions but must preserve the overall number of partitions.
  Note that Spark will call Scan.toBatch() again after filtering the scan at runtime.
  
  Parameters:
  
  predicates - data source V2 predicates used to filter the scan at runtime

Interface SupportsRuntimeV2Filtering

Nested Class Summary

Nested classes/interfaces inherited from interface org.apache.spark.sql.connector.read.Scan

Method Summary

Methods inherited from interface org.apache.spark.sql.connector.read.Scan

Method Details

filterAttributes

filter