Class StreamingQueryProgress

Object
org.apache.spark.sql.streaming.StreamingQueryProgress
All Implemented Interfaces:
Serializable, scala.Serializable

public class StreamingQueryProgress extends Object implements scala.Serializable
Information about progress made in the execution of a StreamingQuery during a trigger. Each event relates to processing done for a single trigger of the streaming query. Events are emitted even when no new data is available to be processed.

param: id A unique query id that persists across restarts. See StreamingQuery.id().
param: runId A query id that is unique for every start/restart. See StreamingQuery.runId().
param: name User-specified name of the query, null if not specified.
param: timestamp Beginning time of the trigger, as an ISO8601 timestamp in UTC.
param: batchId A unique id for the current batch of data being processed. Note that in the case of retries after a failure, a given batchId may be executed more than once. Similarly, when there is no data to be processed, the batchId will not be incremented.
param: batchDuration The processing duration of each batch.
param: durationMs The amount of time taken to perform various operations, in milliseconds.
param: eventTime Statistics of event time seen in this batch. It may contain the following keys:


                   "max" -> "2016-12-05T20:54:20.827Z"  // maximum event time seen in this trigger
                   "min" -> "2016-12-05T20:54:20.827Z"  // minimum event time seen in this trigger
                   "avg" -> "2016-12-05T20:54:20.827Z"  // average event time seen in this trigger
                   "watermark" -> "2016-12-05T20:54:20.827Z"  // watermark used in this trigger
                 
All timestamps are in ISO8601 format, in UTC.
param: stateOperators Information about operators in the query that store state.
param: sources Detailed statistics on data being read from each of the streaming sources.
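
Because the event-time statistics are ISO8601 strings in UTC, they can be parsed with `java.time.Instant`. The following is a minimal sketch, not part of the Spark API: the class `EventTimeStats` and the method `watermarkLag` are hypothetical names, and the map passed in is assumed to have the same shape as the value returned by `eventTime()`. Since the keys are documented as optional, the helper returns an empty `Optional` when either one is missing.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper (not part of Spark): works on a map shaped like eventTime().
final class EventTimeStats {
    private EventTimeStats() {}

    // Returns how far the watermark trails the maximum event time seen in the trigger,
    // or an empty Optional if either the "max" or the "watermark" key is absent.
    static Optional<Duration> watermarkLag(Map<String, String> eventTime) {
        String max = eventTime.get("max");
        String watermark = eventTime.get("watermark");
        if (max == null || watermark == null) {
            return Optional.empty();
        }
        // Instant.parse accepts ISO8601 UTC instants such as 2016-12-05T20:54:20.827Z.
        return Optional.of(Duration.between(Instant.parse(watermark), Instant.parse(max)));
    }
}
```

With a real progress object, the call would presumably look like `EventTimeStats.watermarkLag(progress.eventTime())`, where `progress` is a `StreamingQueryProgress` instance obtained elsewhere.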
Since:
2.1.0
  • Method Details

    • id

      public UUID id()
    • runId

      public UUID runId()
    • name

      public String name()
    • timestamp

      public String timestamp()
    • batchId

      public long batchId()
    • batchDuration

      public long batchDuration()
    • durationMs

      public Map<String,Long> durationMs()
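
One way to use a map of per-operation durations is to find the entry that took longest. The sketch below is illustrative only: `DurationBreakdown` and `slowestOperation` are hypothetical names, and this page does not list the keys that `durationMs()` actually contains, so the helper makes no assumption about them.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper (not part of Spark): works on a map shaped like durationMs().
final class DurationBreakdown {
    private DurationBreakdown() {}

    // Returns the key with the largest duration in milliseconds,
    // or an empty Optional if the map has no entries.
    static Optional<String> slowestOperation(Map<String, Long> durationMs) {
        return durationMs.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }
}
```

If one of the reported keys covers the whole trigger rather than a single step, it will always come out on top, so a caller may want to filter such a key out first.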
    • eventTime

      public Map<String,String> eventTime()
    • stateOperators

      public StateOperatorProgress[] stateOperators()
    • sources

      public SourceProgress[] sources()
    • sink

      public SinkProgress sink()
    • observedMetrics

      public Map<String,Row> observedMetrics()
    • numInputRows

      public long numInputRows()
      The aggregate (across all sources) number of records processed in a trigger.
    • inputRowsPerSecond

      public double inputRowsPerSecond()
      The aggregate (across all sources) rate of data arriving.
    • processedRowsPerSecond

      public double processedRowsPerSecond()
      The aggregate (across all sources) rate at which Spark is processing data.
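
Comparing the arrival rate with the processing rate gives a rough signal of whether a query is keeping up. The sketch below is a heuristic under stated assumptions, not something the API prescribes: `ThroughputCheck` and `isFallingBehind` are hypothetical names, and the guard against NaN is defensive, since this page does not say whether the rates can be undefined for a given trigger.

```java
// Hypothetical helper (not part of Spark): compares the two aggregate rates.
final class ThroughputCheck {
    private ThroughputCheck() {}

    // Returns true when data arrives faster than it is processed.
    // Undefined (NaN) rates are treated as "not falling behind".
    static boolean isFallingBehind(double inputRowsPerSecond, double processedRowsPerSecond) {
        if (Double.isNaN(inputRowsPerSecond) || Double.isNaN(processedRowsPerSecond)) {
            return false;
        }
        return inputRowsPerSecond > processedRowsPerSecond;
    }
}
```

A single trigger where the input rate exceeds the processing rate is not necessarily a problem; a caller would typically look for the condition holding across several consecutive progress events before reacting.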
    • json

      public String json()
      The compact JSON representation of this progress.
    • prettyJson

      public String prettyJson()
      The pretty (i.e. indented) JSON representation of this progress.
    • toString

      public String toString()
      Overrides:
      toString in class Object