Class DataflowGraphTransformer
Object
org.apache.spark.sql.pipelines.graph.DataflowGraphTransformer
- All Implemented Interfaces:
AutoCloseable
Resolves the DataflowGraph by processing each node in the graph. This class exposes visitor functionality to resolve and analyze graph nodes.
Only simple visitor abilities are exposed for transforming the different entities of the graph.
For advanced transformations, the class also exposes a mechanism to walk the graph entity by entity.
Assumptions:
1. Each output has at least one flow to it.
2. A flow may or may not have a destination table. If a flow does not have a destination table, its destination is a temporary view.
The graph is structured so that flows, tables, and sinks are all graph elements (nodes). Transformation functions are exposed for each of these entities, along with a way to walk over the graph.
The constructor is private because all usages should go through DataflowGraphTransformer.withDataflowGraphTransformer.
param: graph: any DataflowGraph
-
Nested Class Summary
Nested Classes
Modifier and Type, Class, Description
static class: Exception thrown when transforming a node in the graph fails with a non-retryable error.
static class
static class: Exception thrown when transforming a node in the graph fails because at least one of its dependencies was not yet transformed.
static class
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description
void close()
DataflowGraphTransformer transformDownNodes(scala.Function2<GraphElement, scala.collection.immutable.Seq<GraphElement>, scala.collection.immutable.Seq<GraphElement>> transformer, boolean disableParallelism)
Example graph: [Flow1, Flow2] -> ST -> Flow3 -> MV. Order of processing: Flow1, Flow2, ST, Flow3, MV.
transformTables(scala.Function1<Table, Table> transformer)
static <T> T withDataflowGraphTransformer(DataflowGraph graph, scala.Function1<DataflowGraphTransformer, T> f)
AutoCloseable wrapper around DataflowGraphTransformer that ensures the transformer is closed without clients needing to remember to close it.
-
Constructor Details
-
DataflowGraphTransformer
-
-
Method Details
-
withDataflowGraphTransformer
public static <T> T withDataflowGraphTransformer(DataflowGraph graph, scala.Function1<DataflowGraphTransformer, T> f)
AutoCloseable wrapper around DataflowGraphTransformer that ensures the transformer is closed without clients needing to remember to close it. It takes the same arguments as the DataflowGraphTransformer constructor and exposes the DataflowGraphTransformer instance within the scope of the callable.
- Parameters:
graph - (undocumented)
f - (undocumented)
- Returns:
(undocumented)
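The wrapper described above follows the general "loan pattern": create the resource, lend it to a caller-supplied function, and close it when the function returns or throws. The following is a minimal, self-contained Java sketch of that pattern only. The `Transformer` class, `withTransformer` method, and `lastCreated` field are hypothetical stand-ins invented for illustration; they are not the Spark classes, and the real method takes a `scala.Function1` rather than a `java.util.function.Function`.

```java
import java.util.function.Function;

public class LoanPatternSketch {
    // Hypothetical stand-in for a closeable transformer; NOT the Spark class.
    static final class Transformer implements AutoCloseable {
        private final String graphName;
        private boolean closed = false;

        Transformer(String graphName) {
            this.graphName = graphName;
        }

        String describe() {
            if (closed) {
                throw new IllegalStateException("transformer already closed");
            }
            return "transforming " + graphName;
        }

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    // Kept only so the example can show that the instance was closed afterwards.
    static Transformer lastCreated;

    // Loan pattern: build the resource, hand it to f, and always close it.
    static <T> T withTransformer(String graphName, Function<Transformer, T> f) {
        try (Transformer t = new Transformer(graphName)) {
            lastCreated = t;
            return f.apply(t);
        }
    }

    public static void main(String[] args) {
        String result = withTransformer("demo", Transformer::describe);
        System.out.println(result);                 // prints "transforming demo"
        System.out.println(lastCreated.isClosed()); // prints "true"
    }
}
```

Because the resource is closed inside the wrapper, the instance should not be used after the callable returns; only the value computed by the callable escapes.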
-
transformTables
-
transformDownNodes
public DataflowGraphTransformer transformDownNodes(scala.Function2<GraphElement, scala.collection.immutable.Seq<GraphElement>, scala.collection.immutable.Seq<GraphElement>> transformer, boolean disableParallelism)
Example graph: [Flow1, Flow2] -> ST -> Flow3 -> MV
Order of processing: Flow1, Flow2, ST, Flow3, MV.
- Parameters:
transformer - function that transforms any graph entity: transformer(nodeToTransform: GraphElement, upstreamNodes: Seq[GraphElement]) => transformedNodes: Seq[GraphElement]
disableParallelism - (undocumented)
- Returns:
this
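The processing order in the example is a topological order: each node is visited only after all of its upstream nodes. A minimal sequential sketch of such a walk, using Kahn's algorithm over plain strings, might look as follows. Everything here is an assumption made for illustration: `TransformDownSketch`, `transformDown`, and `exampleGraph` are hypothetical names, nodes are strings rather than `GraphElement`s, the transformer returns a single node rather than a `Seq`, and nothing is implied about how the actual class schedules work or what `disableParallelism` does.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

public class TransformDownSketch {
    // Visits every node after all of its upstream nodes (Kahn's algorithm).
    // `upstream` maps each node to the nodes that feed into it; every node must be a key.
    static List<String> transformDown(
            Map<String, List<String>> upstream,
            BiFunction<String, List<String>, String> transformer) {
        Map<String, Integer> pending = new LinkedHashMap<>();    // untransformed inputs per node
        Map<String, List<String>> downstream = new HashMap<>();  // reverse edges
        for (Map.Entry<String, List<String>> e : upstream.entrySet()) {
            pending.put(e.getKey(), e.getValue().size());
            for (String up : e.getValue()) {
                downstream.computeIfAbsent(up, k -> new ArrayList<>()).add(e.getKey());
            }
        }

        // Nodes with no inputs can be transformed immediately.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : pending.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }

        Map<String, String> transformed = new HashMap<>();
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String node = ready.poll();
            // Pass the already-transformed upstream nodes to the transformer.
            List<String> transformedUpstream = new ArrayList<>();
            for (String up : upstream.get(node)) {
                transformedUpstream.add(transformed.get(up));
            }
            transformed.put(node, transformer.apply(node, transformedUpstream));
            order.add(node);
            // A downstream node becomes ready once its last input is done.
            for (String down : downstream.getOrDefault(node, List.of())) {
                if (pending.merge(down, -1, Integer::sum) == 0) {
                    ready.add(down);
                }
            }
        }
        if (order.size() != upstream.size()) {
            throw new IllegalStateException("graph contains a cycle");
        }
        return order;
    }

    // The example graph from the text: [Flow1, Flow2] -> ST -> Flow3 -> MV.
    static Map<String, List<String>> exampleGraph() {
        Map<String, List<String>> g = new LinkedHashMap<>();
        g.put("Flow1", List.of());
        g.put("Flow2", List.of());
        g.put("ST", List.of("Flow1", "Flow2"));
        g.put("Flow3", List.of("ST"));
        g.put("MV", List.of("Flow3"));
        return g;
    }

    public static void main(String[] args) {
        // prints "[Flow1, Flow2, ST, Flow3, MV]"
        System.out.println(transformDown(exampleGraph(), (node, ups) -> node + "'"));
    }
}
```

In this sketch Flow1 and Flow2 have no dependency on each other, so they could in principle be handled concurrently; the sequential queue simply visits them in insertion order.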
-
getDataflowGraph
-
close
public void close()
- Specified by:
close in interface AutoCloseable
-