:: Experimental :: A Row representing a mutable aggregation buffer.
This is not meant to be extended outside of Spark.
:: Experimental :: The base class for implementing user-defined aggregate functions (UDAF).
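A minimal sketch of how the UDAF base class and the mutable aggregation buffer fit together, assuming a Spark version in which UserDefinedAggregateFunction is available. The object name LongSum and the field names "value" and "sum" are hypothetical and chosen only for illustration.

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
import org.apache.spark.sql.types.{DataType, LongType, StructField, StructType}

// Hypothetical UDAF that sums a single column of longs, skipping nulls.
object LongSum extends UserDefinedAggregateFunction {
  // Schema of the input arguments: one long column.
  def inputSchema: StructType = StructType(StructField("value", LongType) :: Nil)

  // Schema of the aggregation buffer: the running sum.
  def bufferSchema: StructType = StructType(StructField("sum", LongType) :: Nil)

  // Type of the final result.
  def dataType: DataType = LongType

  // The same input always produces the same output.
  def deterministic: Boolean = true

  // Reset the buffer to the empty sum.
  def initialize(buffer: MutableAggregationBuffer): Unit = {
    buffer(0) = 0L
  }

  // Fold one input row into the buffer.
  def update(buffer: MutableAggregationBuffer, input: Row): Unit = {
    if (!input.isNullAt(0)) {
      buffer(0) = buffer.getLong(0) + input.getLong(0)
    }
  }

  // Combine two partial sums; the result is stored in buffer1.
  def merge(buffer1: MutableAggregationBuffer, buffer2: Row): Unit = {
    buffer1(0) = buffer1.getLong(0) + buffer2.getLong(0)
  }

  // Produce the final value from the buffer.
  def evaluate(buffer: Row): Any = buffer.getLong(0)
}
```

Given a DataFrame df with a long column named value, the function could be applied as df.agg(LongSum(df("value"))). Only initialize, update, and merge receive the mutable buffer; evaluate sees it as a read-only Row.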
A user-defined function. To create one, use the udf functions in the functions object.
As an example:
    import org.apache.spark.sql.functions.udf

    // Define a UDF that returns true or false based on some numeric score.
    val predict = udf((score: Double) => score > 0.5)

    // Project a prediction column computed from the score column.
    df.select(predict(df("score")))
Since: 1.3.0
:: Experimental :: Utility functions for defining windows in DataFrames.
    import org.apache.spark.sql.expressions.Window

    // PARTITION BY country ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    Window.partitionBy("country").orderBy("date").rowsBetween(Long.MinValue, 0)

    // PARTITION BY country ORDER BY date ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING
    Window.partitionBy("country").orderBy("date").rowsBetween(-3, 3)
Since: 1.4.0
:: Experimental :: A window specification that defines the partitioning, ordering, and frame boundaries.
Use the static methods in Window to create a WindowSpec.
Since: 1.4.0
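As a sketch of how a WindowSpec built from Window is typically consumed, the following applies an aggregate over the frame with Column.over. The column names "country" and "date" follow the Window examples; the "amount" and "running_total" columns and the RunningTotal object are hypothetical.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.expressions.{Window, WindowSpec}
import org.apache.spark.sql.functions.sum

object RunningTotal {
  // Frame: every row from the start of the partition up to the current row.
  val spec: WindowSpec =
    Window.partitionBy("country").orderBy("date").rowsBetween(Long.MinValue, 0)

  // Adds a per-country running total of the (hypothetical) amount column.
  def withRunningTotal(df: DataFrame): DataFrame =
    df.withColumn("running_total", sum("amount").over(spec))
}
```

The WindowSpec itself carries no data; it only describes partitioning, ordering, and frame bounds, so the same spec can be reused with several aggregate or ranking functions.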
A base class for user-defined aggregations, which can be used in Dataset operations to take all of the elements of a group and reduce them to a single value.
For example, an aggregator can extract an int from a specific class and add the values up.
Based loosely on Aggregator from Algebird: https://github.com/twitter/algebird
The input type for the aggregation.
The type of the intermediate value of the reduction.
The type of the final output result.
Since: 1.6.0
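A minimal sketch of an aggregator that extracts an int from a class and sums the values, assuming Spark 2.0 or later, where Aggregator also requires bufferEncoder and outputEncoder (the 1.6.0 API may differ). The names Data and SumData are hypothetical.

```scala
import org.apache.spark.sql.{Encoder, Encoders}
import org.apache.spark.sql.expressions.Aggregator

// Hypothetical input class holding the int to be summed.
case class Data(i: Int)

// Input type: Data. Intermediate type: Int. Output type: Int.
object SumData extends Aggregator[Data, Int, Int] {
  // The empty sum; merging it with any value leaves that value unchanged.
  def zero: Int = 0

  // Fold one input element into the intermediate value.
  def reduce(b: Int, a: Data): Int = b + a.i

  // Combine two intermediate values.
  def merge(b1: Int, b2: Int): Int = b1 + b2

  // Convert the intermediate value into the final output.
  def finish(reduction: Int): Int = reduction

  // Encoders for the intermediate and output types (assumed Spark 2.x+ API).
  def bufferEncoder: Encoder[Int] = Encoders.scalaInt
  def outputEncoder: Encoder[Int] = Encoders.scalaInt
}
```

For a Dataset[Data] named ds, the aggregator could be applied with ds.select(SumData.toColumn).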