 functions

object functions

:: Experimental :: Functions available for DataFrame.

Annotations
()
Since

1.3.0

Linear Supertypes
AnyRef, Any

Value Members

1. final def !=(arg0: AnyRef): Boolean

Definition Classes
AnyRef
2. final def !=(arg0: Any): Boolean

Definition Classes
Any
3. final def ##(): Int

Definition Classes
AnyRef → Any
4. final def ==(arg0: AnyRef): Boolean

Definition Classes
AnyRef
5. final def ==(arg0: Any): Boolean

Definition Classes
Any
6. def abs(e: Column): Column

Computes the absolute value.

Since

1.3.0

7. def acos(columnName: String): Column

Computes the cosine inverse of the given column; the returned angle is in the range 0.0 through pi.

Since

1.4.0

8. def acos(e: Column): Column

Computes the cosine inverse of the given value; the returned angle is in the range 0.0 through pi.

Since

1.4.0

9. def approxCountDistinct(columnName: String, rsd: Double): Column

Aggregate function: returns the approximate number of distinct items in a group.

Since

1.3.0

10. def approxCountDistinct(e: Column, rsd: Double): Column

Aggregate function: returns the approximate number of distinct items in a group.

Since

1.3.0

11. def approxCountDistinct(columnName: String): Column

Aggregate function: returns the approximate number of distinct items in a group.

Since

1.3.0

12. def approxCountDistinct(e: Column): Column

Aggregate function: returns the approximate number of distinct items in a group.

Since

1.3.0

13. def array(colName: String, colNames: String*): Column

Creates a new array column. The input columns must all have the same data type.

Since

1.4.0

14. def array(cols: Column*): Column

Creates a new array column. The input columns must all have the same data type.

Annotations
@varargs()
Since

1.4.0

15. final def asInstanceOf[T0]: T0

Definition Classes
Any
16. def asc(columnName: String): Column

Returns a sort expression based on ascending order of the column.

```
// Sort by dept in ascending order, and then age in descending order.
df.sort(asc("dept"), desc("age"))
```
Since

1.3.0

17. def asin(columnName: String): Column

Computes the sine inverse of the given column; the returned angle is in the range -pi/2 through pi/2.

Since

1.4.0

18. def asin(e: Column): Column

Computes the sine inverse of the given value; the returned angle is in the range -pi/2 through pi/2.

Since

1.4.0

19. def atan(columnName: String): Column

Computes the tangent inverse of the given column.

Since

1.4.0

20. def atan(e: Column): Column

Computes the tangent inverse of the given value.

Since

1.4.0

21. def atan2(l: Double, rightName: String): Column

Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).

Since

1.4.0

22. def atan2(l: Double, r: Column): Column

Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).

Since

1.4.0

23. def atan2(leftName: String, r: Double): Column

Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).

Since

1.4.0

24. def atan2(l: Column, r: Double): Column

Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).

Since

1.4.0

25. def atan2(leftName: String, rightName: String): Column

Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).

Since

1.4.0

26. def atan2(leftName: String, r: Column): Column

Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).

Since

1.4.0

27. def atan2(l: Column, rightName: String): Column

Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).

Since

1.4.0

28. def atan2(l: Column, r: Column): Column

Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).
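
As a rough illustration of the rectangular-to-polar conversion, the plain-Scala sketch below works on ordinary doubles rather than columns, using `scala.math.atan2` and `scala.math.hypot`. It assumes the Column versions follow the same argument convention as `java.lang.Math.atan2` (y-coordinate first), which this page does not state. `PolarSketch` and `toPolar` are hypothetical names introduced only for this example.

```scala
object PolarSketch {
  // Converts rectangular (x, y) to polar (r, theta).
  // Note: math.atan2 takes the y-coordinate as its FIRST argument.
  def toPolar(x: Double, y: Double): (Double, Double) =
    (math.hypot(x, y), math.atan2(y, x))

  def main(args: Array[String]): Unit = {
    val (r, theta) = toPolar(1.0, 1.0)
    println(s"r = $r, theta = $theta")
  }
}
```

For the point (1, 1) the angle is pi/4 and the radius is the square root of 2, up to floating-point rounding.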

Since

1.4.0

29. def avg(columnName: String): Column

Aggregate function: returns the average of the values in a group.

Since

1.3.0

30. def avg(e: Column): Column

Aggregate function: returns the average of the values in a group.

Since

1.3.0

31. def bitwiseNOT(e: Column): Column

Computes bitwise NOT.

Since

1.4.0

32. def callUDF(f: Function10[_, _, _, _, _, _, _, _, _, _, _], returnType: DataType, arg1: Column, arg2: Column, arg3: Column, arg4: Column, arg5: Column, arg6: Column, arg7: Column, arg8: Column, arg9: Column, arg10: Column): Column

Call a Scala function of 10 arguments as a user-defined function (UDF). This requires you to specify the return data type.

Since

1.3.0

33. def callUDF(f: Function9[_, _, _, _, _, _, _, _, _, _], returnType: DataType, arg1: Column, arg2: Column, arg3: Column, arg4: Column, arg5: Column, arg6: Column, arg7: Column, arg8: Column, arg9: Column): Column

Call a Scala function of 9 arguments as a user-defined function (UDF). This requires you to specify the return data type.

Since

1.3.0

34. def callUDF(f: Function8[_, _, _, _, _, _, _, _, _], returnType: DataType, arg1: Column, arg2: Column, arg3: Column, arg4: Column, arg5: Column, arg6: Column, arg7: Column, arg8: Column): Column

Call a Scala function of 8 arguments as a user-defined function (UDF). This requires you to specify the return data type.

Since

1.3.0

35. def callUDF(f: Function7[_, _, _, _, _, _, _, _], returnType: DataType, arg1: Column, arg2: Column, arg3: Column, arg4: Column, arg5: Column, arg6: Column, arg7: Column): Column

Call a Scala function of 7 arguments as a user-defined function (UDF). This requires you to specify the return data type.

Since

1.3.0

36. def callUDF(f: Function6[_, _, _, _, _, _, _], returnType: DataType, arg1: Column, arg2: Column, arg3: Column, arg4: Column, arg5: Column, arg6: Column): Column

Call a Scala function of 6 arguments as a user-defined function (UDF). This requires you to specify the return data type.

Since

1.3.0

37. def callUDF(f: Function5[_, _, _, _, _, _], returnType: DataType, arg1: Column, arg2: Column, arg3: Column, arg4: Column, arg5: Column): Column

Call a Scala function of 5 arguments as a user-defined function (UDF). This requires you to specify the return data type.

Since

1.3.0

38. def callUDF(f: Function4[_, _, _, _, _], returnType: DataType, arg1: Column, arg2: Column, arg3: Column, arg4: Column): Column

Call a Scala function of 4 arguments as a user-defined function (UDF). This requires you to specify the return data type.

Since

1.3.0

39. def callUDF(f: Function3[_, _, _, _], returnType: DataType, arg1: Column, arg2: Column, arg3: Column): Column

Call a Scala function of 3 arguments as a user-defined function (UDF). This requires you to specify the return data type.

Since

1.3.0

40. def callUDF(f: Function2[_, _, _], returnType: DataType, arg1: Column, arg2: Column): Column

Call a Scala function of 2 arguments as a user-defined function (UDF). This requires you to specify the return data type.

Since

1.3.0

41. def callUDF(f: Function1[_, _], returnType: DataType, arg1: Column): Column

Call a Scala function of 1 argument as a user-defined function (UDF). This requires you to specify the return data type.

Since

1.3.0

42. def callUDF(f: Function0[_], returnType: DataType): Column

Call a Scala function of 0 arguments as a user-defined function (UDF). This requires you to specify the return data type.

Since

1.3.0

43. def callUdf(udfName: String, cols: Column*): Column

Call a user-defined function. Example:

```
import org.apache.spark.sql._
import org.apache.spark.sql.functions._  // provides callUdf

val df = Seq(("id1", 1), ("id2", 4), ("id3", 5)).toDF("id", "value")
val sqlContext = df.sqlContext
sqlContext.udf.register("simpleUdf", (v: Int) => v * v)
df.select($"id", callUdf("simpleUdf", $"value"))
```
Since

1.4.0

44. def cbrt(columnName: String): Column

Computes the cube-root of the given column.

Since

1.4.0

45. def cbrt(e: Column): Column

Computes the cube-root of the given value.

Since

1.4.0

46. def ceil(columnName: String): Column

Computes the ceiling of the given column.

Since

1.4.0

47. def ceil(e: Column): Column

Computes the ceiling of the given value.

Since

1.4.0

48. def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
49. def coalesce(e: Column*): Column

Returns the first column that is not null.

`df.select(coalesce(df("a"), df("b")))`
Annotations
@varargs()
Since

1.3.0

50. def col(colName: String): Column

Returns a Column based on the given column name.

Since

1.3.0

51. def column(colName: String): Column

Returns a Column based on the given column name. Alias of col.

Since

1.3.0

52. def cos(columnName: String): Column

Computes the cosine of the given column.

Since

1.4.0

53. def cos(e: Column): Column

Computes the cosine of the given value.

Since

1.4.0

54. def cosh(columnName: String): Column

Computes the hyperbolic cosine of the given column.

Since

1.4.0

55. def cosh(e: Column): Column

Computes the hyperbolic cosine of the given value.

Since

1.4.0

56. def count(columnName: String): Column

Aggregate function: returns the number of items in a group.

Since

1.3.0

57. def count(e: Column): Column

Aggregate function: returns the number of items in a group.

Since

1.3.0

58. def countDistinct(columnName: String, columnNames: String*): Column

Aggregate function: returns the number of distinct items in a group.

Annotations
@varargs()
Since

1.3.0

59. def countDistinct(expr: Column, exprs: Column*): Column

Aggregate function: returns the number of distinct items in a group.

Annotations
@varargs()
Since

1.3.0

60. def cumeDist(): Column

Window function: returns the cumulative distribution of values within a window partition, i.e. the fraction of rows that are below the current row.

```
N = total number of rows in the partition
cumeDist(x) = number of values before (and including) x / N
```

This is equivalent to the CUME_DIST function in SQL.
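
The formula can be tried out on an ordinary Scala collection. The sketch below is plain Scala, not Spark code; it assumes the partition is ordered ascending and that tied values count as "before (and including)" each other, as in SQL's CUME_DIST. `CumeDistSketch` is a hypothetical name.

```scala
object CumeDistSketch {
  // For each value x: (number of values <= x) / N, where N is the partition size.
  def cumeDist(partition: Seq[Int]): Seq[Double] = {
    val n = partition.size.toDouble
    partition.map(x => partition.count(_ <= x) / n)
  }

  def main(args: Array[String]): Unit = {
    // Four rows, two of them tied at 20.
    println(cumeDist(Seq(10, 20, 20, 30))) // List(0.25, 0.75, 0.75, 1.0)
  }
}
```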

Since

1.4.0

61. def denseRank(): Column

Window function: returns the rank of rows within a window partition, without any gaps.

The difference between rank and denseRank is that denseRank leaves no gaps in ranking sequence when there are ties. That is, if you were ranking a competition using denseRank and had three people tie for second place, you would say that all three were in second place and that the next person came in third.

This is equivalent to the DENSE_RANK function in SQL.
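
To make the tie-handling concrete, here is a small plain-Scala sketch (not Spark code) that computes both rankings over a list of scores already sorted best-first. `RankSketch` and its method names are hypothetical and exist only for this illustration.

```scala
object RankSketch {
  // rank: 1 + number of rows strictly ahead of the first row with this score.
  def rank(sorted: Seq[Int]): Seq[Int] =
    sorted.map(x => sorted.indexOf(x) + 1)

  // denseRank: 1 + number of DISTINCT scores strictly ahead of this score.
  def denseRank(sorted: Seq[Int]): Seq[Int] = {
    val distinctScores = sorted.distinct
    sorted.map(x => distinctScores.indexOf(x) + 1)
  }

  def main(args: Array[String]): Unit = {
    val scores = Seq(100, 90, 90, 90, 80) // three people tie for second place
    println(rank(scores))      // List(1, 2, 2, 2, 5)
    println(denseRank(scores)) // List(1, 2, 2, 2, 3)
  }
}
```

With three people tied for second place, the dense ranking places the next person third, while the gapped ranking places them fifth.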

Since

1.4.0

62. def desc(columnName: String): Column

Returns a sort expression based on the descending order of the column.

```
// Sort by dept in ascending order, and then age in descending order.
df.sort(asc("dept"), desc("age"))
```
Since

1.3.0

63. final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
64. def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
65. def exp(columnName: String): Column

Computes the exponential of the given column.

Since

1.4.0

66. def exp(e: Column): Column

Computes the exponential of the given value.

Since

1.4.0

67. def explode(e: Column): Column

Creates a new row for each element in the given array or map column.

68. def expm1(columnName: String): Column

Computes the exponential of the given column minus one.

Since

1.4.0

69. def expm1(e: Column): Column

Computes the exponential of the given value minus one.
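
A dedicated "exp minus one" function matters mainly for inputs close to zero, where computing `exp(x) - 1` directly loses most of its significant digits. The plain-Scala sketch below compares the two on a `Double`, assuming the Column version behaves like `scala.math.expm1`; `Expm1Sketch` and `naive` are hypothetical names.

```scala
object Expm1Sketch {
  // Direct formula: exp(x) is rounded to a double near 1.0 before the subtraction,
  // so most of the information about a tiny x is lost.
  def naive(x: Double): Double = math.exp(x) - 1.0

  def main(args: Array[String]): Unit = {
    val x = 1e-10
    println(s"naive: ${naive(x)}")
    println(s"expm1: ${math.expm1(x)}")
  }
}
```

For `x = 1e-10` the true result is very close to `x` itself; `math.expm1` stays close to that value, while the direct formula drifts noticeably further from it.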

Since

1.4.0

70. def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
71. def first(columnName: String): Column

Aggregate function: returns the first value of a column in a group.

Since

1.3.0

72. def first(e: Column): Column

Aggregate function: returns the first value in a group.

Since

1.3.0

73. def floor(columnName: String): Column

Computes the floor of the given column.

Since

1.4.0

74. def floor(e: Column): Column

Computes the floor of the given value.

Since

1.4.0

75. final def getClass(): Class[_]

Definition Classes
AnyRef → Any
76. def hashCode(): Int

Definition Classes
AnyRef → Any
77. def hypot(l: Double, rightName: String): Column

Computes `sqrt(a^2 + b^2)` without intermediate overflow or underflow.

Since

1.4.0

78. def hypot(l: Double, r: Column): Column

Computes `sqrt(a^2 + b^2)` without intermediate overflow or underflow.

Since

1.4.0

79. def hypot(leftName: String, r: Double): Column

Computes `sqrt(a^2 + b^2)` without intermediate overflow or underflow.

Since

1.4.0

80. def hypot(l: Column, r: Double): Column

Computes `sqrt(a^2 + b^2)` without intermediate overflow or underflow.

Since

1.4.0

81. def hypot(leftName: String, rightName: String): Column

Computes `sqrt(a^2 + b^2)` without intermediate overflow or underflow.

Since

1.4.0

82. def hypot(leftName: String, r: Column): Column

Computes `sqrt(a^2 + b^2)` without intermediate overflow or underflow.

Since

1.4.0

83. def hypot(l: Column, rightName: String): Column

Computes `sqrt(a^2 + b^2)` without intermediate overflow or underflow.

Since

1.4.0

84. def hypot(l: Column, r: Column): Column

Computes `sqrt(a^2 + b^2)` without intermediate overflow or underflow.

Since

1.4.0
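
The "without intermediate overflow" guarantee of the hypot overloads above can be seen on plain doubles. The sketch below is ordinary Scala, not Spark code, and assumes the Column versions behave like `scala.math.hypot`; `HypotSketch` and `naive` are hypothetical names.

```scala
object HypotSketch {
  // Direct formula: the squares can exceed Double.MaxValue even when
  // the final result would be representable.
  def naive(a: Double, b: Double): Double = math.sqrt(a * a + b * b)

  def main(args: Array[String]): Unit = {
    val (a, b) = (3e200, 4e200)
    println(s"naive: ${naive(a, b)}")       // squares overflow, so this is Infinity
    println(s"hypot: ${math.hypot(a, b)}")  // finite, approximately 5e200
  }
}
```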

85. final def isInstanceOf[T0]: Boolean

Definition Classes
Any
86. def lag(e: Column, offset: Int, defaultValue: Any): Column

Window function: returns the value that is `offset` rows before the current row, and `defaultValue` if there are fewer than `offset` rows before the current row. For example, an `offset` of one will return the previous row at any given point in the window partition.

This is equivalent to the LAG function in SQL.

Since

1.4.0

87. def lag(columnName: String, offset: Int, defaultValue: Any): Column

Window function: returns the value that is `offset` rows before the current row, and `defaultValue` if there are fewer than `offset` rows before the current row. For example, an `offset` of one will return the previous row at any given point in the window partition.

This is equivalent to the LAG function in SQL.

Since

1.4.0

88. def lag(columnName: String, offset: Int): Column

Window function: returns the value that is `offset` rows before the current row, and `null` if there are fewer than `offset` rows before the current row. For example, an `offset` of one will return the previous row at any given point in the window partition.

This is equivalent to the LAG function in SQL.

Since

1.4.0

89. def lag(e: Column, offset: Int): Column

Window function: returns the value that is `offset` rows before the current row, and `null` if there are fewer than `offset` rows before the current row. For example, an `offset` of one will return the previous row at any given point in the window partition.

This is equivalent to the LAG function in SQL.

Since

1.4.0

90. def last(columnName: String): Column

Aggregate function: returns the last value of the column in a group.

Since

1.3.0

91. def last(e: Column): Column

Aggregate function: returns the last value in a group.

Since

1.3.0

92. def lead(e: Column, offset: Int, defaultValue: Any): Column

Window function: returns the value that is `offset` rows after the current row, and `defaultValue` if there are fewer than `offset` rows after the current row. For example, an `offset` of one will return the next row at any given point in the window partition.

This is equivalent to the LEAD function in SQL.

Since

1.4.0

93. def lead(columnName: String, offset: Int, defaultValue: Any): Column

Window function: returns the value that is `offset` rows after the current row, and `defaultValue` if there are fewer than `offset` rows after the current row. For example, an `offset` of one will return the next row at any given point in the window partition.

This is equivalent to the LEAD function in SQL.

Since

1.4.0

94. def lead(e: Column, offset: Int): Column

Window function: returns the value that is `offset` rows after the current row, and `null` if there are fewer than `offset` rows after the current row. For example, an `offset` of one will return the next row at any given point in the window partition.

This is equivalent to the LEAD function in SQL.

Since

1.4.0

95. def lead(columnName: String, offset: Int): Column

Window function: returns the value that is `offset` rows after the current row, and `null` if there are fewer than `offset` rows after the current row. For example, an `offset` of one will return the next row at any given point in the window partition.

This is equivalent to the LEAD function in SQL.
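
The behaviour of lag and lead can be modelled on an ordinary Scala sequence standing in for one ordered window partition. This is a plain-Scala sketch, not Spark's implementation: `None` plays the role of `null`, a non-negative `offset` is assumed, and `LagLeadSketch` is a hypothetical name.

```scala
object LagLeadSketch {
  // Value `offset` rows BEFORE each row, or `default` when no such row exists.
  def lag[A](rows: Seq[A], offset: Int, default: Option[A] = None): Seq[Option[A]] =
    rows.indices.map { i =>
      if (i - offset >= 0) Some(rows(i - offset)) else default
    }

  // Value `offset` rows AFTER each row, or `default` when no such row exists.
  def lead[A](rows: Seq[A], offset: Int, default: Option[A] = None): Seq[Option[A]] =
    rows.indices.map { i =>
      if (i + offset < rows.size) Some(rows(i + offset)) else default
    }

  def main(args: Array[String]): Unit = {
    val partition = Seq(10, 20, 30)
    println(lag(partition, 1))           // Vector(None, Some(10), Some(20))
    println(lead(partition, 1, Some(0))) // Vector(Some(20), Some(30), Some(0))
  }
}
```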

Since

1.4.0

96. def lit(literal: Any): Column

Creates a Column of literal value.

The passed in object is returned directly if it is already a Column. If the object is a Scala Symbol, it is converted into a Column also. Otherwise, a new Column is created to represent the literal value.

Since

1.3.0

97. def log(columnName: String): Column

Computes the natural logarithm of the given column.

Since

1.4.0

98. def log(e: Column): Column

Computes the natural logarithm of the given value.

Since

1.4.0

99. def log10(columnName: String): Column

Computes the logarithm of the given value in Base 10.

Since

1.4.0

100. def log10(e: Column): Column

Computes the logarithm of the given value in Base 10.

Since

1.4.0

101. def log1p(columnName: String): Column

Computes the natural logarithm of the given column plus one.

Since

1.4.0

102. def log1p(e: Column): Column

Computes the natural logarithm of the given value plus one.

Since

1.4.0

103. def lower(e: Column): Column

Converts a string expression to lower case.

Since

1.3.0

104. def max(columnName: String): Column

Aggregate function: returns the maximum value of the column in a group.

Since

1.3.0

105. def max(e: Column): Column

Aggregate function: returns the maximum value of the expression in a group.

Since

1.3.0

106. def mean(columnName: String): Column

Aggregate function: returns the average of the values in a group. Alias for avg.

Since

1.4.0

107. def mean(e: Column): Column

Aggregate function: returns the average of the values in a group. Alias for avg.

Since

1.4.0

108. def min(columnName: String): Column

Aggregate function: returns the minimum value of the column in a group.

Since

1.3.0

109. def min(e: Column): Column

Aggregate function: returns the minimum value of the expression in a group.

Since

1.3.0

110. def monotonicallyIncreasingId(): Column

A column expression that generates monotonically increasing 64-bit integers.

The generated ID is guaranteed to be monotonically increasing and unique, but not consecutive. The current implementation puts the partition ID in the upper 31 bits, and the record number within each partition in the lower 33 bits. The assumption is that the data frame has fewer than 1 billion partitions, and each partition has fewer than 8 billion records.

As an example, consider a DataFrame with two partitions, each with 3 records. This expression would return the following IDs: 0, 1, 2, 8589934592 (1L << 33), 8589934593, 8589934594.
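
The bit layout described above can be reproduced with plain `Long` arithmetic. The sketch below is ordinary Scala, not Spark's implementation, and simply shifts the partition ID past the lower 33 bits; `MonotonicIdSketch`, `id`, and `ids` are hypothetical names.

```scala
object MonotonicIdSketch {
  // Partition ID in the upper bits, record number in the lower 33 bits.
  def id(partitionId: Int, recordNumber: Long): Long =
    (partitionId.toLong << 33) + recordNumber

  // IDs for a data set described only by the size of each partition.
  def ids(partitionSizes: Seq[Int]): Seq[Long] =
    for {
      (size, partitionId) <- partitionSizes.zipWithIndex
      record <- 0 until size
    } yield id(partitionId, record.toLong)

  def main(args: Array[String]): Unit = {
    // Two partitions with three records each, as in the example above.
    println(ids(Seq(3, 3)))
    // List(0, 1, 2, 8589934592, 8589934593, 8589934594)
  }
}
```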

Since

1.4.0

111. final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
112. def negate(e: Column): Column

Unary minus, i.e. negate the expression.

```
// Select the amount column and negate all values.
// Scala:
df.select( -df("amount") )

// Java:
df.select( negate(df.col("amount")) );
```
Since

1.3.0

113. def not(e: Column): Column

Inversion of boolean expression, i.e. NOT.

```
// Scala: select rows that are not active (isActive === false)
df.filter( !df("isActive") )

// Java:
df.filter( not(df.col("isActive")) );
```
Since

1.3.0

114. final def notify(): Unit

Definition Classes
AnyRef
115. final def notifyAll(): Unit

Definition Classes
AnyRef
116. def ntile(n: Int): Column

Window function: returns the ntile group id (from 1 to `n` inclusive) in an ordered window partition. For example, if `n` is 4, the first quarter of the rows will get value 1, the second quarter will get 2, the third quarter will get 3, and the last quarter will get 4.

This is equivalent to the NTILE function in SQL.
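
The bucket assignment can be sketched in plain Scala for a partition of a given size. This is not Spark's implementation. In particular, the handling of row counts that do not divide evenly (the first `rowCount % n` buckets get one extra row) is an assumption borrowed from the usual SQL NTILE behaviour and is not stated on this page. `NtileSketch` is a hypothetical name.

```scala
object NtileSketch {
  // Bucket id (1 to n) for each of `rowCount` rows, in window order.
  // Assumption: leftover rows go to the first (rowCount % n) buckets.
  def ntile(rowCount: Int, n: Int): Seq[Int] = {
    val base = rowCount / n
    val extra = rowCount % n
    (1 to n).flatMap { bucket =>
      val size = if (bucket <= extra) base + 1 else base
      Seq.fill(size)(bucket)
    }
  }

  def main(args: Array[String]): Unit = {
    println(ntile(8, 4))  // Vector(1, 1, 2, 2, 3, 3, 4, 4)
    println(ntile(10, 4)) // Vector(1, 1, 1, 2, 2, 2, 3, 3, 4, 4)
  }
}
```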

Since

1.4.0

117. def percentRank(): Column

Window function: returns the relative rank (i.e. percentile) of rows within a window partition.

This is computed by:

`(rank of row in its partition - 1) / (number of rows in the partition - 1)`

This is equivalent to the PERCENT_RANK function in SQL.
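
As a worked instance of the formula, the plain-Scala sketch below computes the relative rank for values that are already in window order. It is not Spark code. Returning 0.0 for a single-row partition is an assumption made here to avoid dividing by zero, and `PercentRankSketch` is a hypothetical name.

```scala
object PercentRankSketch {
  // (rank of row in its partition - 1) / (number of rows in the partition - 1)
  def percentRank(sorted: Seq[Int]): Seq[Double] = {
    val n = sorted.size
    sorted.map { x =>
      val rank = sorted.indexOf(x) + 1 // tied rows share the rank of the first one
      if (n > 1) (rank - 1).toDouble / (n - 1) else 0.0 // assumption for n == 1
    }
  }

  def main(args: Array[String]): Unit = {
    println(percentRank(Seq(10, 20, 20, 30, 40)))
    // List(0.0, 0.25, 0.25, 0.75, 1.0)
  }
}
```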

Since

1.4.0

118. def pow(l: Double, rightName: String): Column

Returns the value of the first argument raised to the power of the second argument.

Since

1.4.0

119. def pow(l: Double, r: Column): Column

Returns the value of the first argument raised to the power of the second argument.

Since

1.4.0

120. def pow(leftName: String, r: Double): Column

Returns the value of the first argument raised to the power of the second argument.

Since

1.4.0

121. def pow(l: Column, r: Double): Column

Returns the value of the first argument raised to the power of the second argument.

Since

1.4.0

122. def pow(leftName: String, rightName: String): Column

Returns the value of the first argument raised to the power of the second argument.

Since

1.4.0

123. def pow(leftName: String, r: Column): Column

Returns the value of the first argument raised to the power of the second argument.

Since

1.4.0

124. def pow(l: Column, rightName: String): Column

Returns the value of the first argument raised to the power of the second argument.

Since

1.4.0

125. def pow(l: Column, r: Column): Column

Returns the value of the first argument raised to the power of the second argument.

Since

1.4.0

126. def rand(): Column

Generate a random column with i.i.d. samples from U[0.0, 1.0].

Since

1.4.0

127. def rand(seed: Long): Column

Generate a random column with i.i.d. samples from U[0.0, 1.0].

Since

1.4.0

128. def randn(): Column

Generate a column with i.i.d. samples from the standard normal distribution.

Since

1.4.0

129. def randn(seed: Long): Column

Generate a column with i.i.d. samples from the standard normal distribution.

Since

1.4.0

130. def rank(): Column

Window function: returns the rank of rows within a window partition.

The difference between rank and denseRank is that denseRank leaves no gaps in the ranking sequence when there are ties. That is, if you were ranking a competition using denseRank and had three people tie for second place, you would say that all three were in second place and that the next person came in third. With rank, the next person would instead come in fifth.

This is equivalent to the RANK function in SQL.

Since

1.4.0

131. def rint(columnName: String): Column

Returns the double value that is closest in value to the argument and is equal to a mathematical integer.

Since

1.4.0

132. def rint(e: Column): Column

Returns the double value that is closest in value to the argument and is equal to a mathematical integer.
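
The description leaves open what happens when the argument lies exactly halfway between two integers. The plain-Scala sketch below shows the behaviour of `scala.math.rint` (which delegates to `java.lang.Math.rint` and resolves ties toward the even integer), on the assumption that the Column version rounds the same way. `RintSketch` and `roundAll` are hypothetical names.

```scala
object RintSketch {
  // Applies math.rint to each value; ties go to the even neighbour.
  def roundAll(values: Seq[Double]): Seq[Double] = values.map(math.rint)

  def main(args: Array[String]): Unit = {
    println(roundAll(Seq(2.4, 2.5, 3.5, -2.5)))
    // List(2.0, 2.0, 4.0, -2.0)
  }
}
```

Note that 2.5 rounds down to 2.0 while 3.5 rounds up to 4.0, which differs from the "round half up" rule many readers expect.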

Since

1.4.0

133. def rowNumber(): Column

Window function: returns a sequential number starting at 1 within a window partition.

This is equivalent to the ROW_NUMBER function in SQL.

Since

1.4.0

134. def signum(columnName: String): Column

Computes the signum of the given column.

Since

1.4.0

135. def signum(e: Column): Column

Computes the signum of the given value.

Since

1.4.0

136. def sin(columnName: String): Column

Computes the sine of the given column.

Since

1.4.0

137. def sin(e: Column): Column

Computes the sine of the given value.

Since

1.4.0

138. def sinh(columnName: String): Column

Computes the hyperbolic sine of the given column.

Since

1.4.0

139. def sinh(e: Column): Column

Computes the hyperbolic sine of the given value.

Since

1.4.0

140. def sparkPartitionId(): Column

Partition ID of the Spark task.

Note that this is non-deterministic because it depends on data partitioning and task scheduling.

Since

1.4.0

141. def sqrt(e: Column): Column

Computes the square root of the specified float value.

Since

1.3.0

142. def struct(colName: String, colNames: String*): Column

Creates a new struct column that composes multiple input columns.

Since

1.4.0

143. def struct(cols: Column*): Column

Creates a new struct column. The input column must be a column in a DataFrame, or a derived column expression that is named (i.e. aliased).

Annotations
@varargs()
Since

1.4.0

144. def sum(columnName: String): Column

Aggregate function: returns the sum of all values in the given column.

Since

1.3.0

145. def sum(e: Column): Column

Aggregate function: returns the sum of all values in the expression.

Since

1.3.0

146. def sumDistinct(columnName: String): Column

Aggregate function: returns the sum of distinct values in the expression.

Since

1.3.0

147. def sumDistinct(e: Column): Column

Aggregate function: returns the sum of distinct values in the expression.

Since

1.3.0
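
The difference between sum and sumDistinct can be illustrated on an ordinary Scala collection. This is a sketch of the semantics only, not of how the aggregates are evaluated, and the names are hypothetical.

```scala
object SumSketch {
  val values: Seq[Int] = Seq(1, 2, 2, 3, 3, 3)

  // sum: every value counts, including repeats.
  val total: Int = values.sum                    // 14

  // sumDistinct: each distinct value counts once.
  val distinctTotal: Int = values.distinct.sum   // 6
}
```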

148. final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
149. def tan(columnName: String): Column

Computes the tangent of the given column.

Since

1.4.0

150. def tan(e: Column): Column

Computes the tangent of the given value.

Since

1.4.0

151. def tanh(columnName: String): Column

Computes the hyperbolic tangent of the given column.

Since

1.4.0

152. def tanh(e: Column): Column

Computes the hyperbolic tangent of the given value.

Since

1.4.0

153. def toDegrees(columnName: String): Column

Converts an angle measured in radians to an approximately equivalent angle measured in degrees.

Since

1.4.0

154. def toDegrees(e: Column): Column

Converts an angle measured in radians to an approximately equivalent angle measured in degrees.

Since

1.4.0

155. def toRadians(columnName: String): Column

Converts an angle measured in degrees to an approximately equivalent angle measured in radians.

Since

1.4.0

156. def toRadians(e: Column): Column

Converts an angle measured in degrees to an approximately equivalent angle measured in radians.

Since

1.4.0
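
Because the conversions between degrees and radians are only approximately exact in floating point, results are best compared with a tolerance. A plain-Scala sketch using `scala.math` (the object name, the helper `roughlyEqual`, and the tolerance are hypothetical choices):

```scala
object AngleSketch {
  val halfTurnDegrees: Double = math.toDegrees(math.Pi)  // close to 180.0
  val rightAngleRadians: Double = math.toRadians(90.0)   // close to Pi / 2

  // Compare with a tolerance rather than with ==.
  def roughlyEqual(a: Double, b: Double, eps: Double = 1e-9): Boolean =
    math.abs(a - b) <= eps
}
```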

157. def toString(): String

Definition Classes
AnyRef → Any
158. def udf[RT, A1, A2, A3, A4, A5, A6, A7, A8, A9, A10](f: (A1, A2, A3, A4, A5, A6, A7, A8, A9, A10) ⇒ RT)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[RT], arg1: scala.reflect.api.JavaUniverse.TypeTag[A1], arg2: scala.reflect.api.JavaUniverse.TypeTag[A2], arg3: scala.reflect.api.JavaUniverse.TypeTag[A3], arg4: scala.reflect.api.JavaUniverse.TypeTag[A4], arg5: scala.reflect.api.JavaUniverse.TypeTag[A5], arg6: scala.reflect.api.JavaUniverse.TypeTag[A6], arg7: scala.reflect.api.JavaUniverse.TypeTag[A7], arg8: scala.reflect.api.JavaUniverse.TypeTag[A8], arg9: scala.reflect.api.JavaUniverse.TypeTag[A9], arg10: scala.reflect.api.JavaUniverse.TypeTag[A10]): UserDefinedFunction

Defines a user-defined function (UDF) of 10 arguments. The data types are automatically inferred based on the function's signature.

Since

1.3.0

159. def udf[RT, A1, A2, A3, A4, A5, A6, A7, A8, A9](f: (A1, A2, A3, A4, A5, A6, A7, A8, A9) ⇒ RT)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[RT], arg1: scala.reflect.api.JavaUniverse.TypeTag[A1], arg2: scala.reflect.api.JavaUniverse.TypeTag[A2], arg3: scala.reflect.api.JavaUniverse.TypeTag[A3], arg4: scala.reflect.api.JavaUniverse.TypeTag[A4], arg5: scala.reflect.api.JavaUniverse.TypeTag[A5], arg6: scala.reflect.api.JavaUniverse.TypeTag[A6], arg7: scala.reflect.api.JavaUniverse.TypeTag[A7], arg8: scala.reflect.api.JavaUniverse.TypeTag[A8], arg9: scala.reflect.api.JavaUniverse.TypeTag[A9]): UserDefinedFunction

Defines a user-defined function (UDF) of 9 arguments. The data types are automatically inferred based on the function's signature.

Since

1.3.0

160. def udf[RT, A1, A2, A3, A4, A5, A6, A7, A8](f: (A1, A2, A3, A4, A5, A6, A7, A8) ⇒ RT)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[RT], arg1: scala.reflect.api.JavaUniverse.TypeTag[A1], arg2: scala.reflect.api.JavaUniverse.TypeTag[A2], arg3: scala.reflect.api.JavaUniverse.TypeTag[A3], arg4: scala.reflect.api.JavaUniverse.TypeTag[A4], arg5: scala.reflect.api.JavaUniverse.TypeTag[A5], arg6: scala.reflect.api.JavaUniverse.TypeTag[A6], arg7: scala.reflect.api.JavaUniverse.TypeTag[A7], arg8: scala.reflect.api.JavaUniverse.TypeTag[A8]): UserDefinedFunction

Defines a user-defined function (UDF) of 8 arguments. The data types are automatically inferred based on the function's signature.

Since

1.3.0

161. def udf[RT, A1, A2, A3, A4, A5, A6, A7](f: (A1, A2, A3, A4, A5, A6, A7) ⇒ RT)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[RT], arg1: scala.reflect.api.JavaUniverse.TypeTag[A1], arg2: scala.reflect.api.JavaUniverse.TypeTag[A2], arg3: scala.reflect.api.JavaUniverse.TypeTag[A3], arg4: scala.reflect.api.JavaUniverse.TypeTag[A4], arg5: scala.reflect.api.JavaUniverse.TypeTag[A5], arg6: scala.reflect.api.JavaUniverse.TypeTag[A6], arg7: scala.reflect.api.JavaUniverse.TypeTag[A7]): UserDefinedFunction

Defines a user-defined function (UDF) of 7 arguments. The data types are automatically inferred based on the function's signature.

Since

1.3.0

162. def udf[RT, A1, A2, A3, A4, A5, A6](f: (A1, A2, A3, A4, A5, A6) ⇒ RT)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[RT], arg1: scala.reflect.api.JavaUniverse.TypeTag[A1], arg2: scala.reflect.api.JavaUniverse.TypeTag[A2], arg3: scala.reflect.api.JavaUniverse.TypeTag[A3], arg4: scala.reflect.api.JavaUniverse.TypeTag[A4], arg5: scala.reflect.api.JavaUniverse.TypeTag[A5], arg6: scala.reflect.api.JavaUniverse.TypeTag[A6]): UserDefinedFunction

Defines a user-defined function (UDF) of 6 arguments. The data types are automatically inferred based on the function's signature.

Since

1.3.0

163. def udf[RT, A1, A2, A3, A4, A5](f: (A1, A2, A3, A4, A5) ⇒ RT)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[RT], arg1: scala.reflect.api.JavaUniverse.TypeTag[A1], arg2: scala.reflect.api.JavaUniverse.TypeTag[A2], arg3: scala.reflect.api.JavaUniverse.TypeTag[A3], arg4: scala.reflect.api.JavaUniverse.TypeTag[A4], arg5: scala.reflect.api.JavaUniverse.TypeTag[A5]): UserDefinedFunction

Defines a user-defined function (UDF) of 5 arguments. The data types are automatically inferred based on the function's signature.

Since

1.3.0

164. def udf[RT, A1, A2, A3, A4](f: (A1, A2, A3, A4) ⇒ RT)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[RT], arg1: scala.reflect.api.JavaUniverse.TypeTag[A1], arg2: scala.reflect.api.JavaUniverse.TypeTag[A2], arg3: scala.reflect.api.JavaUniverse.TypeTag[A3], arg4: scala.reflect.api.JavaUniverse.TypeTag[A4]): UserDefinedFunction

Defines a user-defined function (UDF) of 4 arguments. The data types are automatically inferred based on the function's signature.

Since

1.3.0

165. def udf[RT, A1, A2, A3](f: (A1, A2, A3) ⇒ RT)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[RT], arg1: scala.reflect.api.JavaUniverse.TypeTag[A1], arg2: scala.reflect.api.JavaUniverse.TypeTag[A2], arg3: scala.reflect.api.JavaUniverse.TypeTag[A3]): UserDefinedFunction

Defines a user-defined function (UDF) of 3 arguments. The data types are automatically inferred based on the function's signature.

Since

1.3.0

166. def udf[RT, A1, A2](f: (A1, A2) ⇒ RT)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[RT], arg1: scala.reflect.api.JavaUniverse.TypeTag[A1], arg2: scala.reflect.api.JavaUniverse.TypeTag[A2]): UserDefinedFunction

Defines a user-defined function (UDF) of 2 arguments. The data types are automatically inferred based on the function's signature.

Since

1.3.0

167. def udf[RT, A1](f: (A1) ⇒ RT)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[RT], arg1: scala.reflect.api.JavaUniverse.TypeTag[A1]): UserDefinedFunction

Defines a user-defined function (UDF) of 1 argument. The data types are automatically inferred based on the function's signature.

Since

1.3.0

168. def udf[RT](f: () ⇒ RT)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[RT]): UserDefinedFunction

Defines a user-defined function (UDF) of 0 arguments. The data types are automatically inferred based on the function's signature.

Since

1.3.0
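
The udf overloads take implicit type evidence for the return type and each argument type, which is how the types can be recovered from the function's signature. The following plain-Scala sketch shows the general mechanism only. It swaps in `ClassTag` for the `TypeTag` used in the signatures above so that it needs nothing beyond the standard library, and the object and method names are hypothetical.

```scala
import scala.reflect.{ClassTag, classTag}

object UdfTypeSketch {
  // Hypothetical helper: recovers the argument and return classes of a
  // one-argument function from implicit evidence, without the caller
  // spelling the types out separately.
  def describe[RT: ClassTag, A1: ClassTag](f: A1 => RT): (Class[_], Class[_]) =
    (classTag[A1].runtimeClass, classTag[RT].runtimeClass)

  // A1 = String and RT = Int are inferred from the lambda's signature.
  val (argClass, returnClass) = describe((s: String) => s.length)
}
```

Here `argClass` is the class of `String` and `returnClass` is the class of `Int`, both obtained from the lambda alone.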

169. def upper(e: Column): Column

Converts a string expression to upper case.

Since

1.3.0

170. final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
171. final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
172. final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
173. def when(condition: Column, value: Any): Column

Evaluates a list of conditions and returns one of multiple possible result expressions. If otherwise is not defined at the end, null is returned for unmatched conditions.

```
// Example: encoding gender string column into integer.

// Scala:
people.select(when(people("gender") === "male", 0)
  .when(people("gender") === "female", 1)
  .otherwise(2))

// Java:
people.select(when(col("gender").equalTo("male"), 0)
  .when(col("gender").equalTo("female"), 1)
  .otherwise(2))
```

Since

1.4.0
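
The evaluation rule (conditions are checked in order, and an unmatched input yields null unless otherwise is given) can be mirrored in plain Scala. This is a sketch of the semantics only: `Branches`, `eval`, and `encode` are hypothetical names, and `None` stands in for null.

```scala
object WhenSketch {
  // Hypothetical stand-in for a chain of when(...) branches over inputs of type A.
  final case class Branches[A, B](cases: Vector[(A => Boolean, B)]) {
    // Appends another condition; earlier conditions take precedence.
    def when(cond: A => Boolean, value: B): Branches[A, B] =
      Branches(cases :+ ((cond, value)))

    // No otherwise: unmatched inputs yield None (standing in for null).
    def eval(input: A): Option[B] =
      cases.collectFirst { case (cond, value) if cond(input) => value }

    // With otherwise: unmatched inputs yield the default.
    def otherwise(default: B): A => B =
      input => eval(input).getOrElse(default)
  }

  def when[A, B](cond: A => Boolean, value: B): Branches[A, B] =
    Branches(Vector((cond, value)))

  // Mirrors the gender-encoding example above.
  val encode: String => Int =
    when((g: String) => g == "male", 0).when(_ == "female", 1).otherwise(2)
}
```

With this sketch, `WhenSketch.encode("female")` gives 1 and any unlisted value gives 2, while a chain evaluated without otherwise gives `None` for an unmatched input.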