Interface StreamingQuery


public interface StreamingQuery
A handle to a query that is executing continuously in the background as new data arrives. All methods of this interface are thread-safe.
Since:
2.0.0
  • Method Details

    • awaitTermination

      void awaitTermination() throws StreamingQueryException
      Waits for the termination of this query, either by query.stop() or by an exception. If the query has terminated with an exception, then the exception will be thrown.

      If the query has terminated, then all subsequent calls to this method will either return immediately (if the query was terminated by stop()), or throw the exception immediately (if the query terminated with an exception).

      Throws:
      StreamingQueryException - if the query has terminated with an exception.

      Since:
      2.0.0
    • awaitTermination

      boolean awaitTermination(long timeoutMs) throws StreamingQueryException
      Waits for the termination of this query, either by query.stop() or by an exception. If the query has terminated with an exception, then the exception will be thrown. Otherwise, it returns whether the query has terminated within timeoutMs milliseconds.

      If the query has terminated, then all subsequent calls to this method will either return true immediately (if the query was terminated by stop()), or throw the exception immediately (if the query terminated with an exception).

      Parameters:
      timeoutMs - the maximum time to wait, in milliseconds
      Returns:
      true if the query terminated within timeoutMs milliseconds, false otherwise
      Throws:
      StreamingQueryException - if the query has terminated with an exception

      Since:
      2.0.0
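
      The timeout variant lends itself to waiting in short slices, so the caller can do other work (logging, health checks) between waits. The following is a minimal sketch under stated assumptions, not Spark code: Awaitable is a hypothetical stand-in that mirrors only the documented signature, and awaitInSlices, sliceMs and maxSlices are illustrative names. The real method also declares StreamingQueryException, which the stand-in omits so the sketch has no dependencies.

```java
final class AwaitInSlicesSketch {

    /**
     * Hypothetical stand-in mirroring only the documented signature
     * boolean awaitTermination(long timeoutMs). The real StreamingQuery
     * method additionally declares StreamingQueryException.
     */
    interface Awaitable {
        boolean awaitTermination(long timeoutMs);
    }

    /**
     * Waits in slices of sliceMs milliseconds, at most maxSlices times.
     * Returns true as soon as one wait reports termination, and false
     * if the query is still running after the last slice.
     */
    static boolean awaitInSlices(Awaitable query, long sliceMs, int maxSlices) {
        for (int slice = 0; slice < maxSlices; slice++) {
            if (query.awaitTermination(sliceMs)) {
                return true; // the query terminated within this slice
            }
            // Still running: a caller could log progress or check health here.
        }
        return false;
    }
}
```

      With a real query, an exception thrown by the wait would propagate to the caller, as described above; the sketch only covers the true/false outcome.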
    • exception

      scala.Option<StreamingQueryException> exception()
      Returns the StreamingQueryException if the query was terminated by an exception.
      Returns:
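      an Option containing the StreamingQueryException if the query was terminated by an exception, otherwise an empty Option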
      Since:
      2.0.0
    • explain

      void explain()
      Prints the physical plan to the console for debugging purposes.
      Since:
      2.0.0
    • explain

      void explain(boolean extended)
      Prints the physical plan to the console for debugging purposes.

      Parameters:
      extended - whether to print the extended explanation
      Since:
      2.0.0
    • id

      UUID id()
      Returns the unique id of this query that persists across restarts from checkpoint data. That is, this id is generated when a query is started for the first time, and will be the same every time it is restarted from checkpoint data. Also see runId().

      Returns:
      the unique id of this query, which stays the same across restarts from checkpoint data
      Since:
      2.1.0
    • isActive

      boolean isActive()
      Returns true if this query is actively running.

      Returns:
      true if this query is actively running, false otherwise
      Since:
      2.0.0
    • lastProgress

      StreamingQueryProgress lastProgress()
      Returns the most recent StreamingQueryProgress update of this streaming query.

      Returns:
      the most recent StreamingQueryProgress update of this query
      Since:
      2.1.0
    • name

      String name()
      Returns the user-specified name of the query, or null if not specified. This name can be specified in the org.apache.spark.sql.streaming.DataStreamWriter as dataframe.writeStream.queryName("query").start(). This name, if set, must be unique across all active queries.

      Returns:
      the user-specified name of the query, or null if none was specified
      Since:
      2.0.0
    • processAllAvailable

      void processAllAvailable()
      Blocks until all available data in the source has been processed and committed to the sink. This method is intended for testing. Note that in the case of continually arriving data, this method may block forever. Additionally, this method is only guaranteed to block until data that was synchronously appended to an org.apache.spark.sql.execution.streaming.Source prior to invocation has been processed (i.e. getOffset must immediately reflect the addition).
      Since:
      2.0.0
    • recentProgress

      StreamingQueryProgress[] recentProgress()
      Returns an array of the most recent StreamingQueryProgress updates for this query. The number of progress updates retained for each stream is set by the Spark session configuration spark.sql.streaming.numRecentProgressUpdates.

      Returns:
      an array of the most recent StreamingQueryProgress updates for this query
      Since:
      2.1.0
    • runId

      UUID runId()
      Returns the unique id of this run of the query. That is, every start/restart of a query will generate a unique runId. Therefore, every time a query is restarted from checkpoint, it will have the same id() but different runId()s.
      Returns:
      the unique id of this run of the query, which changes on every start or restart
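
      The relationship between id() and runId() can be expressed as a small check: two handles belong to the same query across a restart when their id() values match and their runId() values differ. This is a minimal sketch, not Spark code: Identified is a hypothetical stand-in that mirrors only the two documented accessors, and isRestartOf and handle are illustrative helpers.

```java
import java.util.UUID;

final class QueryIdentitySketch {

    /** Hypothetical stand-in mirroring only the documented id() and runId() accessors. */
    interface Identified {
        UUID id();
        UUID runId();
    }

    /**
     * Returns true when both handles refer to the same query (equal id)
     * but to different runs of it (different runId).
     */
    static boolean isRestartOf(Identified earlier, Identified later) {
        return earlier.id().equals(later.id())
                && !earlier.runId().equals(later.runId());
    }

    /** Illustrative factory for a fixed (id, runId) pair, used to simulate handles. */
    static Identified handle(UUID id, UUID runId) {
        return new Identified() {
            @Override public UUID id() { return id; }
            @Override public UUID runId() { return runId; }
        };
    }
}
```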
    • sparkSession

      SparkSession sparkSession()
      Returns the SparkSession associated with this query.

      Returns:
      the SparkSession associated with this query
      Since:
      2.0.0
    • status

      StreamingQueryStatus status()
      Returns the current status of the query.

      Returns:
      the current status of the query
      Since:
      2.0.2
    • stop

      void stop() throws TimeoutException
      Stops the execution of this query if it is running. This call waits until the query execution threads terminate or until a timeout is hit.

      By default, stop() blocks indefinitely. A timeout can be set through the configuration spark.sql.streaming.stopTimeout; a timeout of 0 (or a negative number of) milliseconds also blocks indefinitely. If a TimeoutException is thrown, the caller can retry stopping the stream. If the issue persists, it is advisable to kill the Spark application.

      Throws:
      TimeoutException - if the query does not terminate within the configured timeout
      Since:
      2.0.0
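
      The retry advice above can be sketched as a bounded loop. This is an illustration under stated assumptions, not Spark code: Stoppable is a hypothetical stand-in that mirrors only the documented signature void stop() throws TimeoutException, and stopWithRetry and maxAttempts are illustrative names. What to do after the final failed attempt (for example, killing the application) is left to the caller.

```java
import java.util.concurrent.TimeoutException;

final class StopWithRetrySketch {

    /** Hypothetical stand-in mirroring only the documented stop() signature. */
    interface Stoppable {
        void stop() throws TimeoutException;
    }

    /**
     * Calls stop() up to maxAttempts times. Returns true once a call
     * completes without a TimeoutException, and false if every attempt
     * timed out.
     */
    static boolean stopWithRetry(Stoppable query, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                query.stop();
                return true; // stop() returned, so the query has terminated
            } catch (TimeoutException e) {
                System.err.println(
                        "stop() timed out on attempt " + attempt + " of " + maxAttempts);
            }
        }
        return false; // the caller decides how to escalate
    }
}
```

      A bounded loop only makes sense when spark.sql.streaming.stopTimeout is set to a positive value; with the default, stop() blocks until termination and never throws TimeoutException.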