Package org.apache.spark.sql.pipelines.graph
Class Description
Used in full graph updates to select all flows.
Used in full graph updates to select all tables.
A Flow that reads its source(s) completely and appends data to the target, just once.
A `FlowExecution` that writes a batch `DataFrame` to a `Table`.
Raised when there's a circular dependency in the current pipeline.
A Flow that declares exactly what data should be in the target table.
Processor responsible for analyzing each flow and sorting the nodes in topological order.
DataflowGraph represents the core graph structure for Spark declarative pipelines.
Resolves the DataflowGraph by processing each node in the graph.
Exception thrown when transforming a node in the graph fails with a non-retryable error.
Exception thrown when transforming a node in the graph fails because at least one of its dependencies has not yet been transformed.
DatasetManager is responsible for materializing tables in the catalog based on the given graph.
Wraps table materialization exceptions.
A flow's execution may complete for two reasons: 1.
Indicates that there was a failure while stopping the flow.
Abstract class used to identify failures related to stopping an operation or to timeouts.
A Flow is a node of data transformation in a dataflow graph.
A `FlowExecution` specifies how to execute a flow and manages its execution.
Specifies how we should filter Flows.
A wrapper for the lambda function that defines a Flow.
Holds the DataFrame returned by a FlowFunction along with the inputs used to construct it.
param: identifier The identifier of the flow.
Plans execution of Flows in a DataflowGraph by converting Flows into `FlowExecution`s.
Used in partial graph updates to select flows that flow to "selectedTables".
An element in a DataflowGraph.
Collection of errors that can be thrown during graph resolution / analysis.
Represents the reason why a flow execution should be stopped.
Indicates that the flow execution should be retried.
Indicates that the flow execution should be stopped with a specific reason.
GraphFilter<E>: Specifies how we should filter Graph elements.
Responsible for properly qualifying the identifiers of datasets inside or referenced by the dataflow graph.
Represents the identifier for a dataset that is defined or referenced in a pipeline.
Represents the identifier for a dataset that is external to the current pipeline.
Represents the identifier for a dataset that is defined by the current pipeline.
A mutable context for registering tables, views, and flows in a dataflow graph.
Validations performed on a `DataflowGraph`.
Specifies an input that can be referenced by another Dataset's query.
Exception raised when a flow fails to read from a table defined within the pipeline.
Used to specify that no flows should be refreshed.
Used to select no tables.
Represents a node in a DataflowGraph that can be written to by a Flow.
Represents a persisted View in a DataflowGraph.
Executes a DataflowGraph by resolving the graph, materializing datasets, and running the flows.
Interface for validating and accessing Pipeline-specific table properties.
An implementation of the PipelineUpdateContext trait used in production.
Contains the catalog and database context information for query execution.
Indicates that a run has failed due to a query execution failure.
Records information used to track the provenance of a given query to user code.
A Flow whose flow function has been invoked, meaning either: - Its output schema and dependencies are known.
A Flow whose flow function has failed to resolve.
A Flow whose flow function has successfully resolved.
A wrapper for a resolved internal input that includes the alias provided by the user.
Indicates that a triggered run has successfully completed execution.
Indicates that a run entered the failed state.
Helper exception class that indicates that a run has to be terminated and tracks the associated termination reason.
Used in partial graph updates to select "selectedTables".
SQL statement processor context.
Class that holds the logical plan and query origin parsed from a SQL statement.
Data class for all state that is accumulated while processing a particular SqlGraphRegistrationContext.
A Flow that represents stateful movement of data to some target.
A `FlowExecution` that processes data statefully using Structured Streaming.
A `StreamingFlowExecution` that writes a streaming `DataFrame` to a `Table`.
A table representing a materialized dataset in a DataflowGraph.
Specifies how we should filter Tables.
A type of Input where data is loaded from a table.
Represents a temporary View in a DataflowGraph.
Executes all of the flows in the given graph in topological order.
Uncaught exception handler which first calls the delegate and then calls the OnFailure function with the uncaught exception.
Run could not be associated with a proper root cause.
Returns a flow filter that is a union of two flow filters.
Exception raised when a flow tries to read from a dataset that exists but is unresolved.
A Flow whose output schema and dependencies aren't known.
Exception raised when a pipeline has one or more flows that cannot be resolved.
Represents a view in the DataflowGraph.
A type of TableInput that returns data from a specified schema or from the inferred Flows that write to the table.
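
Several descriptions on this page mention sorting nodes in topological order and raising an error on a circular dependency. The sketch below illustrates that general idea only. It uses Kahn's algorithm as a stand-in, and `FlowOrderSketch`, `topologicalOrder`, and `inputsOf` are hypothetical names that do not come from this package; nothing here should be read as the package's actual implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of org.apache.spark.sql.pipelines.graph.
class FlowOrderSketch {

    /**
     * Orders nodes so that every node appears after all of the nodes it reads from.
     * inputsOf maps each node name to the names of its inputs (an assumed shape).
     * Throws IllegalStateException if the dependencies contain a cycle.
     */
    public static List<String> topologicalOrder(Map<String, List<String>> inputsOf) {
        Map<String, Integer> pending = new LinkedHashMap<>();       // unresolved input count per node
        Map<String, List<String>> readers = new LinkedHashMap<>();  // node -> nodes that read from it

        for (Map.Entry<String, List<String>> e : inputsOf.entrySet()) {
            pending.putIfAbsent(e.getKey(), 0);
            for (String input : e.getValue()) {
                pending.putIfAbsent(input, 0);
                pending.merge(e.getKey(), 1, Integer::sum);
                readers.computeIfAbsent(input, k -> new ArrayList<>()).add(e.getKey());
            }
        }

        // Start with nodes that have no unresolved inputs.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : pending.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String node = ready.remove();
            order.add(node);
            // Emitting a node resolves one input for each of its readers.
            for (String reader : readers.getOrDefault(node, List.of())) {
                if (pending.merge(reader, -1, Integer::sum) == 0) {
                    ready.add(reader);
                }
            }
        }

        // Any node never emitted is part of, or downstream of, a cycle.
        if (order.size() != pending.size()) {
            throw new IllegalStateException("Circular dependency detected");
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> inputsOf = new LinkedHashMap<>();
        inputsOf.put("report", List.of("cleaned"));
        inputsOf.put("cleaned", List.of("raw"));
        inputsOf.put("raw", List.of());
        System.out.println(topologicalOrder(inputsOf)); // prints [raw, cleaned, report]
    }
}
```

In this sketch, a pair of nodes that read from each other never reaches a pending count of zero, so the size check at the end reports the cycle instead of looping forever.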