Package org.apache.spark.sql
Class Observation
Object
org.apache.spark.sql.Observation
Helper class to simplify usage of
Dataset.observe(String, Column, Column*)
:
// Observe row count (rows) and highest id (maxid) in the Dataset while writing it
val observation = Observation("my metrics")
val observed_ds = ds.observe(observation, count(lit(1)).as("rows"), max($"id").as("maxid"))
observed_ds.write.parquet("ds.parquet")
val metrics = observation.get
This collects the metrics while the first action is executed on the observed dataset.
Subsequent actions do not modify the metrics returned by get()
. Retrieval of the metric via
get()
blocks until the first action has finished and metrics become available.
This class does not support streaming datasets.
param: name name of the metric
- Since:
- 3.3.0
-
Constructor Summary
-
Method Summary
Modifier and TypeMethodDescriptionstatic Observation
apply()
Observation constructor for creating an anonymous observation.static Observation
Observation constructor for creating a named observation.future()
Future holding the (yet to be completed) observation.get()
(Scala-specific) Get the observed metrics.(Java-specific) Get the observed metrics.name()
-
Constructor Details
-
Observation
-
Observation
public Observation()Create an Observation with a random name.
-
-
Method Details
-
apply
Observation constructor for creating an anonymous observation.- Returns:
- (undocumented)
-
apply
Observation constructor for creating a named observation.- Parameters:
name
- (undocumented)- Returns:
- (undocumented)
-
name
-
future
Future holding the (yet to be completed) observation.- Returns:
- (undocumented)
-
get
(Scala-specific) Get the observed metrics. This waits for the observed dataset to finish its first action. Only the result of the first action is available. Subsequent actions do not modify the result.- Returns:
- the observed metrics as a
Map[String, Any]
- Throws:
InterruptedException
- interrupted while waiting
-
getAsJava
(Java-specific) Get the observed metrics. This waits for the observed dataset to finish its first action. Only the result of the first action is available. Subsequent actions do not modify the result.- Returns:
- the observed metrics as a
java.util.Map[String, Object]
- Throws:
InterruptedException
- interrupted while waiting
-