package expressions

Package Members

package javalang
package scalalang

Type Members

abstract class Aggregator[-IN, BUF, OUT] extends Serializable with UserDefinedFunctionLike
A base class for user-defined aggregations, which can be used in Dataset operations to take all of the elements of a group and reduce them to a single value.
For example, the following aggregator extracts the Int field from each instance of a case class and sums the values:
```
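import org.apache.spark.sql.{Dataset, Encoder, Encoders}
import org.apache.spark.sql.expressions.Aggregator

case class Data(i: Int)

// Sums the `i` field of every Data element in a group.
val customSummer = new Aggregator[Data, Int, Int] {
  // Initial (empty) buffer value.
  def zero: Int = 0
  // Fold one input element into the buffer.
  def reduce(b: Int, a: Data): Int = b + a.i
  // Combine two partial buffers.
  def merge(b1: Int, b2: Int): Int = b1 + b2
  // Turn the final buffer into the output value.
  def finish(r: Int): Int = r
  def bufferEncoder: Encoder[Int] = Encoders.scalaInt
  def outputEncoder: Encoder[Int] = Encoders.scalaInt
}.toColumn // toColumn is declared without a parameter list

val ds: Dataset[Data] = ...
val aggregated = ds.select(customSummer)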
```
Based loosely on Aggregator from Algebird: https://github.com/twitter/algebird
IN
The input type for the aggregation.
BUF
The type of the intermediate value of the reduction.
OUT
The type of the final output result.
Annotations
@SerialVersionUID()
Since
1.6.0
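
The roles of zero, reduce, merge and finish can be illustrated without Spark. The following is a minimal sketch, not Spark's implementation: `SimpleAggregator`, `SumData` and `run` are hypothetical names introduced here, and the partitioning is simulated with nested sequences. It assumes only the contract described above, in which each partition is folded from `zero`, the partial buffers are merged, and the merged buffer is finished.

```scala
// Hypothetical, Spark-free stand-in for the Aggregator contract (illustration only).
trait SimpleAggregator[IN, BUF, OUT] {
  def zero: BUF
  def reduce(b: BUF, a: IN): BUF
  def merge(b1: BUF, b2: BUF): BUF
  def finish(r: BUF): OUT
}

case class Data(i: Int)

// Same logic as the summing aggregator in the example above.
object SumData extends SimpleAggregator[Data, Int, Int] {
  def zero: Int = 0
  def reduce(b: Int, a: Data): Int = b + a.i
  def merge(b1: Int, b2: Int): Int = b1 + b2
  def finish(r: Int): Int = r
}

// Fold each simulated partition from `zero`, merge the partial buffers, then finish.
def run[IN, BUF, OUT](agg: SimpleAggregator[IN, BUF, OUT], partitions: Seq[Seq[IN]]): OUT = {
  val partials = partitions.map(_.foldLeft(agg.zero)(agg.reduce))
  agg.finish(partials.foldLeft(agg.zero)(agg.merge))
}

// Three simulated partitions, one of them empty.
val partitions = Seq(Seq(Data(1), Data(2)), Seq(Data(3)), Seq.empty[Data])
val total = run(SumData, partitions) // 6
```

Because an empty partition contributes only `zero`, merging with `zero` must leave the other buffer unchanged; that is why 0 is the right choice for a sum.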
abstract class MutableAggregationBuffer extends Row
A Row representing a mutable aggregation buffer.
This is not meant to be extended outside of Spark.
Annotations
@Stable()
Since
1.5.0

sealed abstract class UserDefinedFunction extends UserDefinedFunctionLike

A user-defined function. To create one, use the udf functions in functions.

As an example:

```
import org.apache.spark.sql.functions.udf

// Define a UDF that returns true or false based on a numeric score.
val predict = udf((score: Double) => score > 0.5)

// Project a prediction column computed from the score column.
df.select(predict(df("score")))
```

Annotations: @Stable()
Since: 1.3.0

class Window extends AnyRef

Utility functions for defining windows in DataFrames.

```
// PARTITION BY country ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
Window.partitionBy("country").orderBy("date")
  .rowsBetween(Window.unboundedPreceding, Window.currentRow)

// PARTITION BY country ORDER BY date ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING
Window.partitionBy("country").orderBy("date").rowsBetween(-3, 3)
```

Annotations: @Stable()
Since: 1.4.0

class WindowSpec extends AnyRef
A window specification that defines the partitioning, ordering, and frame boundaries.
Use the static methods in Window to create a WindowSpec.
Annotations
@Stable()
Since
1.4.0

Deprecated Type Members

abstract class UserDefinedAggregateFunction extends Serializable with UserDefinedFunctionLike
The base class for implementing user-defined aggregate functions (UDAF).
Annotations
@Stable() @deprecated
Deprecated
(Since version 3.0.0) Aggregator[IN, BUF, OUT] should now be registered as a UDF via the functions.udaf(agg) method.
Since
1.5.0
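
The replacement named in the deprecation note can be sketched as follows. This is an illustrative example rather than a prescribed migration path: the local session, the `LongSum` aggregator, the SQL name `long_sum` and the sample data are all assumptions made for the sketch. It relies on `functions.udaf` (available since 3.0.0) to wrap an Aggregator as a UserDefinedFunction.

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}
import org.apache.spark.sql.expressions.Aggregator
import org.apache.spark.sql.functions.udaf

// Hypothetical local session, for illustration only.
val spark = SparkSession.builder().master("local[1]").appName("udaf-sketch").getOrCreate()
import spark.implicits._

// The input type is the column type (Long), not a case class,
// because the function is applied to untyped columns.
object LongSum extends Aggregator[Long, Long, Long] {
  def zero: Long = 0L
  def reduce(b: Long, a: Long): Long = b + a
  def merge(b1: Long, b2: Long): Long = b1 + b2
  def finish(r: Long): Long = r
  def bufferEncoder: Encoder[Long] = Encoders.scalaLong
  def outputEncoder: Encoder[Long] = Encoders.scalaLong
}

// Wrap the Aggregator as a UDF and register it under a (hypothetical) SQL name.
val longSum = spark.udf.register("long_sum", udaf(LongSum))

val df = Seq(("a", 1L), ("a", 2L), ("b", 5L)).toDF("key", "value")
df.createOrReplaceTempView("t")

// Use the registered function from SQL ...
val viaSql = spark.sql("SELECT key, long_sum(value) AS total FROM t GROUP BY key ORDER BY key")
// ... or apply the returned UserDefinedFunction to columns directly.
val viaDsl = df.groupBy("key").agg(longSum($"value").as("total")).orderBy("key")
```

Both queries should produce one row per key holding the sum of its values.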

Value Members

object SparkUserDefinedFunction extends Serializable
object Window
Utility functions for defining windows in DataFrames.
```
// PARTITION BY country ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
Window.partitionBy("country").orderBy("date")
  .rowsBetween(Window.unboundedPreceding, Window.currentRow)

// PARTITION BY country ORDER BY date ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING
Window.partitionBy("country").orderBy("date").rowsBetween(-3, 3)
```
Annotations
@Stable()
Since
1.4.0
Note
When ordering is not defined, an unbounded window frame (rowFrame, unboundedPreceding, unboundedFollowing) is used by default. When ordering is defined, a growing window frame (rangeFrame, unboundedPreceding, currentRow) is used by default.
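
The effect of the two default frames in the note above can be seen by applying the same aggregate over an unordered and an ordered window. This is a small sketch under assumed inputs: the local session, the `sales` data and the column names are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.sum

// Hypothetical local session and sample data, for illustration only.
val spark = SparkSession.builder().master("local[1]").appName("window-frames").getOrCreate()
import spark.implicits._

val sales = Seq(("US", 1, 10L), ("US", 2, 20L), ("US", 3, 30L))
  .toDF("country", "day", "amount")

// No ordering: the default frame spans the whole partition.
val unordered = Window.partitionBy("country")
// Ordering defined: the default frame grows from the partition start to the current row.
val ordered = Window.partitionBy("country").orderBy("day")

val result = sales
  .select(
    $"day",
    sum($"amount").over(unordered).as("total"),   // same value on every row
    sum($"amount").over(ordered).as("running"))   // cumulative value per row
  .orderBy("day")
```

With this data, `total` is 60 on every row, while `running` is 10, 30 and 60 for days 1, 2 and 3. To avoid depending on the defaults, set the frame explicitly with `rowsBetween` or `rangeBetween`.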
