Interface GraphValidations

All Superinterfaces:
org.apache.spark.internal.Logging
All Known Implementing Classes:
DataflowGraph

public interface GraphValidations extends org.apache.spark.internal.Logging
Validations performed on a `DataflowGraph`.
  • Nested Class Summary

    Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging

    org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter
  • Method Summary

    Modifier and Type
    Method
    Description
    scala.Option<scala.Tuple2<org.apache.spark.sql.catalyst.TableIdentifier,org.apache.spark.sql.catalyst.TableIdentifier>>
    detectCycle(scala.collection.immutable.Map<org.apache.spark.sql.catalyst.TableIdentifier,scala.collection.immutable.Seq<org.apache.spark.sql.catalyst.TableIdentifier>> ancestors)
    Generic method to detect a cycle in directed graph via DFS traversal.
    void
    Validate that each resolved flow is correctly either a streaming flow or non-streaming flow, depending on the flow type (ex.
    void
    Throws an exception if the flows in this graph are not topologically sorted.
    scala.collection.immutable.Map<org.apache.spark.sql.catalyst.TableIdentifier,scala.collection.immutable.Seq<Flow>>
    Validate multi query table correctness.
    void
    Validates that persisted views don't read from invalid sources
    void
    Validates that all flows are resolved.
    void
    Validate that all tables are resettable.
    void
    validateTablesAreResettable(scala.collection.immutable.Seq<Table> tables)
    Validate that all specified tables are resettable.
    void
     

    Methods inherited from interface org.apache.spark.internal.Logging

    initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logBasedOnLevel, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
  • Method Details

    • detectCycle

      scala.Option<scala.Tuple2<org.apache.spark.sql.catalyst.TableIdentifier,org.apache.spark.sql.catalyst.TableIdentifier>> detectCycle(scala.collection.immutable.Map<org.apache.spark.sql.catalyst.TableIdentifier,scala.collection.immutable.Seq<org.apache.spark.sql.catalyst.TableIdentifier>> ancestors)
      Generic method to detect a cycle in directed graph via DFS traversal. The graph is given as a reverse adjacency map, that is, a map from each node to its ancestors.
      Parameters:
      ancestors - (undocumented)
      Returns:
      the start and end node of a cycle if found, None otherwise
    • validateFlowStreamingness

      void validateFlowStreamingness()
      Validate that each resolved flow is correctly either a streaming flow or non-streaming flow, depending on the flow type (ex. once flow vs non-once flow) and the dataset type the flow writes to (ex. streaming table vs materialized view).
    • validateGraphIsTopologicallySorted

      void validateGraphIsTopologicallySorted()
      Throws an exception if the flows in this graph are not topologically sorted.
    • validateMultiQueryTables

      scala.collection.immutable.Map<org.apache.spark.sql.catalyst.TableIdentifier,scala.collection.immutable.Seq<Flow>> validateMultiQueryTables()
      Validate multi query table correctness.
      Returns:
      (undocumented)
    • validatePersistedViewSources

      void validatePersistedViewSources()
      Validates that persisted views don't read from invalid sources
    • validateSuccessfulFlowAnalysis

      void validateSuccessfulFlowAnalysis()
      Validates that all flows are resolved. If there are unresolved flows, detects a possible cyclic dependency and throw the appropriate exception.
    • validateTablesAreResettable

      void validateTablesAreResettable()
      Validate that all tables are resettable. This is a best-effort check that will only catch upstream tables that are resettable but have a non-resettable downstream dependency.
    • validateTablesAreResettable

      void validateTablesAreResettable(scala.collection.immutable.Seq<Table> tables)
      Validate that all specified tables are resettable.
    • validateUserSpecifiedSchemas

      void validateUserSpecifiedSchemas()