Class DatasetManager

Object
org.apache.spark.sql.pipelines.graph.DatasetManager

public class DatasetManager extends Object
DatasetManager materializes the tables defined in a dataflow graph into the catalog. For each table in the graph, it creates the table if none exists (or if the update is a full refresh); otherwise, it merges the schema of the existing table to match the new flows writing to it.
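The create-or-merge rule described above can be illustrated with a minimal, self-contained sketch. It does not use or reproduce Spark's implementation: the class `MaterializationSketch`, the `Action` enum, and both helper methods are hypothetical names introduced only for illustration, and schemas are modeled as simple column-name to type-name maps.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Spark.
class MaterializationSketch {

    // The two outcomes the class description mentions for each table.
    enum Action { CREATE, MERGE_SCHEMA }

    // Create when the table is missing or a full refresh is requested;
    // otherwise merge the schema of the existing table.
    static Action decide(boolean tableExists, boolean fullRefresh) {
        return (!tableExists || fullRefresh) ? Action.CREATE : Action.MERGE_SCHEMA;
    }

    // Simplified merge (an assumption, not Spark's algorithm): keep the existing
    // columns and append columns that only the incoming flows produce.
    // Type conflicts are ignored here; the existing column type wins.
    static Map<String, String> mergeSchemas(Map<String, String> existing,
                                            Map<String, String> incoming) {
        Map<String, String> merged = new LinkedHashMap<>(existing);
        for (Map.Entry<String, String> column : incoming.entrySet()) {
            merged.putIfAbsent(column.getKey(), column.getValue());
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> existing = new LinkedHashMap<>();
        existing.put("id", "long");
        Map<String, String> incoming = new LinkedHashMap<>();
        incoming.put("id", "long");
        incoming.put("name", "string");

        System.out.println(decide(false, false));
        System.out.println(decide(true, false));
        System.out.println(mergeSchemas(existing, incoming));
    }
}
```

In this sketch a full refresh always takes the create path, even when the table already exists, which mirrors the "or if this is a full refresh" clause in the description.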
  • Constructor Details

    • DatasetManager

      public DatasetManager()
  • Method Details

    • materializeDatasets

      public static DataflowGraph materializeDatasets(DataflowGraph resolvedDataflowGraph, PipelineUpdateContext context)
      Materializes the tables in the given graph by creating or updating them in the catalog, according to the graph and the update context.

      Parameters:
      resolvedDataflowGraph - The resolved DataflowGraph, with its resolved flows sorted in topological order.
      context - The context for the pipeline update.
      Returns:
      The graph with materialized tables.
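The parameter description requires flows sorted in topological order, meaning that every dataset appears after the datasets it reads from. The following sketch shows what such an ordering looks like, using Kahn's algorithm on a plain dependency map. It is not how Spark builds or sorts a DataflowGraph; the class `TopologicalOrderSketch`, its method, and the dataset names used with it are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of the Spark API.
class TopologicalOrderSketch {

    // deps maps each dataset name to the names of the datasets it reads from.
    // Returns the names ordered so that inputs always precede their consumers.
    static List<String> topologicalOrder(Map<String, List<String>> deps) {
        // Number of not-yet-emitted inputs per dataset (TreeMap for stable iteration).
        Map<String, Integer> remaining = new TreeMap<>();
        // Reverse edges: input name -> datasets that read from it.
        Map<String, List<String>> dependents = new HashMap<>();

        for (Map.Entry<String, List<String>> entry : deps.entrySet()) {
            remaining.putIfAbsent(entry.getKey(), 0);
            for (String input : entry.getValue()) {
                remaining.putIfAbsent(input, 0);
                remaining.merge(entry.getKey(), 1, Integer::sum);
                dependents.computeIfAbsent(input, k -> new ArrayList<>()).add(entry.getKey());
            }
        }

        // Start with datasets that have no inputs.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : remaining.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String next = ready.poll();
            order.add(next);
            for (String dependent : dependents.getOrDefault(next, List.of())) {
                // A dataset becomes ready once all of its inputs have been emitted.
                if (remaining.merge(dependent, -1, Integer::sum) == 0) {
                    ready.add(dependent);
                }
            }
        }

        if (order.size() != remaining.size()) {
            throw new IllegalArgumentException("Dependency cycle detected");
        }
        return order;
    }
}
```

Processing tables in this order means that, by the time a table is reached, every table it depends on has already been handled.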
    • org$apache$spark$internal$Logging$$log_

      public static org.slf4j.Logger org$apache$spark$internal$Logging$$log_()
    • org$apache$spark$internal$Logging$$log__$eq

      public static void org$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1)
    • LogStringContext

      public static org.apache.spark.internal.Logging.LogStringContext LogStringContext(scala.StringContext sc)