org.apache.spark.sql

functions

object functions

:: Experimental :: Functions available for DataFrame.

Annotations
@Experimental()
Since

1.3.0

Linear Supertypes
AnyRef, Any

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. def abs(e: Column): Column

    Computes the absolute value.

    Computes the absolute value.

    Since

    1.3.0

  7. def acos(columnName: String): Column

    Computes the cosine inverse of the given column; the returned angle is in the range 0.0 through pi.

    Computes the cosine inverse of the given column; the returned angle is in the range 0.0 through pi.

    Since

    1.4.0

  8. def acos(e: Column): Column

    Computes the cosine inverse of the given value; the returned angle is in the range 0.0 through pi.

    Computes the cosine inverse of the given value; the returned angle is in the range 0.0 through pi.

    Since

    1.4.0

  9. def approxCountDistinct(columnName: String, rsd: Double): Column

    Aggregate function: returns the approximate number of distinct items in a group.

    Aggregate function: returns the approximate number of distinct items in a group.
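
    For example (an illustrative sketch; the DataFrame df and its "userId" column are assumed here), rsd is the maximum relative estimation error allowed:

    // Approximate distinct count of userId, tolerating roughly 5% relative error.
    df.select(approxCountDistinct("userId", 0.05))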

    Since

    1.3.0

  10. def approxCountDistinct(e: Column, rsd: Double): Column

    Aggregate function: returns the approximate number of distinct items in a group.

    Aggregate function: returns the approximate number of distinct items in a group.

    Since

    1.3.0

  11. def approxCountDistinct(columnName: String): Column

    Aggregate function: returns the approximate number of distinct items in a group.

    Aggregate function: returns the approximate number of distinct items in a group.

    Since

    1.3.0

  12. def approxCountDistinct(e: Column): Column

    Aggregate function: returns the approximate number of distinct items in a group.

    Aggregate function: returns the approximate number of distinct items in a group.

    Since

    1.3.0

  13. def array(colName: String, colNames: String*): Column

    Creates a new array column.

    Creates a new array column. The input columns must all have the same data type.
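
    A minimal sketch, assuming a DataFrame df with two columns "x" and "y" of the same numeric type:

    // Combine x and y into a single array column named "point".
    df.select(array("x", "y").as("point"))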

    Since

    1.4.0

  14. def array(cols: Column*): Column

    Creates a new array column.

    Creates a new array column. The input columns must all have the same data type.

    Annotations
    @varargs()
    Since

    1.4.0

  15. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  16. def asc(columnName: String): Column

    Returns a sort expression based on ascending order of the column.

    Returns a sort expression based on ascending order of the column.

    // Sort by dept in ascending order, and then age in descending order.
    df.sort(asc("dept"), desc("age"))
    Since

    1.3.0

  17. def asin(columnName: String): Column

    Computes the sine inverse of the given column; the returned angle is in the range -pi/2 through pi/2.

    Computes the sine inverse of the given column; the returned angle is in the range -pi/2 through pi/2.

    Since

    1.4.0

  18. def asin(e: Column): Column

    Computes the sine inverse of the given value; the returned angle is in the range -pi/2 through pi/2.

    Computes the sine inverse of the given value; the returned angle is in the range -pi/2 through pi/2.

    Since

    1.4.0

  19. def atan(columnName: String): Column

    Computes the tangent inverse of the given column.

    Computes the tangent inverse of the given column.

    Since

    1.4.0

  20. def atan(e: Column): Column

    Computes the tangent inverse of the given value.

    Computes the tangent inverse of the given value.

    Since

    1.4.0

  21. def atan2(l: Double, rightName: String): Column

    Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).

    Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).

    Since

    1.4.0

  22. def atan2(l: Double, r: Column): Column

    Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).

    Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).

    Since

    1.4.0

  23. def atan2(leftName: String, r: Double): Column

    Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).

    Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).

    Since

    1.4.0

  24. def atan2(l: Column, r: Double): Column

    Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).

    Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).

    Since

    1.4.0

  25. def atan2(leftName: String, rightName: String): Column

    Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).

    Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).

    Since

    1.4.0

  26. def atan2(leftName: String, r: Column): Column

    Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).

    Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).

    Since

    1.4.0

  27. def atan2(l: Column, rightName: String): Column

    Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).

    Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).

    Since

    1.4.0

  28. def atan2(l: Column, r: Column): Column

    Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).

    Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).
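
    A minimal sketch, assuming a DataFrame df with numeric columns "y" and "x"; as with java.lang.Math.atan2, the first argument is treated as the y-coordinate and the second as the x-coordinate:

    // Angle theta (in radians) of the point (x, y).
    df.select(atan2(df("y"), df("x")))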

    Since

    1.4.0

  29. def avg(columnName: String): Column

    Aggregate function: returns the average of the values in a group.

    Aggregate function: returns the average of the values in a group.

    Since

    1.3.0

  30. def avg(e: Column): Column

    Aggregate function: returns the average of the values in a group.

    Aggregate function: returns the average of the values in a group.

    Since

    1.3.0

  31. def bitwiseNOT(e: Column): Column

    Computes bitwise NOT.

    Computes bitwise NOT.

    Since

    1.4.0

  32. def callUDF(f: Function10[_, _, _, _, _, _, _, _, _, _, _], returnType: DataType, arg1: Column, arg2: Column, arg3: Column, arg4: Column, arg5: Column, arg6: Column, arg7: Column, arg8: Column, arg9: Column, arg10: Column): Column

    Call a Scala function of 10 arguments as user-defined function (UDF).

    Call a Scala function of 10 arguments as user-defined function (UDF). This requires you to specify the return data type.

    Since

    1.3.0

  33. def callUDF(f: Function9[_, _, _, _, _, _, _, _, _, _], returnType: DataType, arg1: Column, arg2: Column, arg3: Column, arg4: Column, arg5: Column, arg6: Column, arg7: Column, arg8: Column, arg9: Column): Column

    Call a Scala function of 9 arguments as user-defined function (UDF).

    Call a Scala function of 9 arguments as user-defined function (UDF). This requires you to specify the return data type.

    Since

    1.3.0

  34. def callUDF(f: Function8[_, _, _, _, _, _, _, _, _], returnType: DataType, arg1: Column, arg2: Column, arg3: Column, arg4: Column, arg5: Column, arg6: Column, arg7: Column, arg8: Column): Column

    Call a Scala function of 8 arguments as user-defined function (UDF).

    Call a Scala function of 8 arguments as user-defined function (UDF). This requires you to specify the return data type.

    Since

    1.3.0

  35. def callUDF(f: Function7[_, _, _, _, _, _, _, _], returnType: DataType, arg1: Column, arg2: Column, arg3: Column, arg4: Column, arg5: Column, arg6: Column, arg7: Column): Column

    Call a Scala function of 7 arguments as user-defined function (UDF).

    Call a Scala function of 7 arguments as user-defined function (UDF). This requires you to specify the return data type.

    Since

    1.3.0

  36. def callUDF(f: Function6[_, _, _, _, _, _, _], returnType: DataType, arg1: Column, arg2: Column, arg3: Column, arg4: Column, arg5: Column, arg6: Column): Column

    Call a Scala function of 6 arguments as user-defined function (UDF).

    Call a Scala function of 6 arguments as user-defined function (UDF). This requires you to specify the return data type.

    Since

    1.3.0

  37. def callUDF(f: Function5[_, _, _, _, _, _], returnType: DataType, arg1: Column, arg2: Column, arg3: Column, arg4: Column, arg5: Column): Column

    Call a Scala function of 5 arguments as user-defined function (UDF).

    Call a Scala function of 5 arguments as user-defined function (UDF). This requires you to specify the return data type.

    Since

    1.3.0

  38. def callUDF(f: Function4[_, _, _, _, _], returnType: DataType, arg1: Column, arg2: Column, arg3: Column, arg4: Column): Column

    Call a Scala function of 4 arguments as user-defined function (UDF).

    Call a Scala function of 4 arguments as user-defined function (UDF). This requires you to specify the return data type.

    Since

    1.3.0

  39. def callUDF(f: Function3[_, _, _, _], returnType: DataType, arg1: Column, arg2: Column, arg3: Column): Column

    Call a Scala function of 3 arguments as user-defined function (UDF).

    Call a Scala function of 3 arguments as user-defined function (UDF). This requires you to specify the return data type.

    Since

    1.3.0

  40. def callUDF(f: Function2[_, _, _], returnType: DataType, arg1: Column, arg2: Column): Column

    Call a Scala function of 2 arguments as user-defined function (UDF).

    Call a Scala function of 2 arguments as user-defined function (UDF). This requires you to specify the return data type.

    Since

    1.3.0

  41. def callUDF(f: Function1[_, _], returnType: DataType, arg1: Column): Column

    Call a Scala function of 1 argument as user-defined function (UDF).

    Call a Scala function of 1 argument as user-defined function (UDF). This requires you to specify the return data type.
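
    A minimal sketch, assuming a DataFrame df with an integer column "value":

    import org.apache.spark.sql.types.IntegerType

    // Square the column; the return type must be supplied explicitly.
    df.select(callUDF((x: Int) => x * x, IntegerType, df("value")))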

    Since

    1.3.0

  42. def callUDF(f: Function0[_], returnType: DataType): Column

    Call a Scala function of 0 arguments as user-defined function (UDF).

    Call a Scala function of 0 arguments as user-defined function (UDF). This requires you to specify the return data type.

    Since

    1.3.0

  43. def callUdf(udfName: String, cols: Column*): Column

    Call a user-defined function.

    Call a user-defined function. Example:

    // Assumes an existing SQLContext named sqlContext.
    import org.apache.spark.sql._
    import sqlContext.implicits._

    val df = Seq(("id1", 1), ("id2", 4), ("id3", 5)).toDF("id", "value")
    sqlContext.udf.register("simpleUdf", (v: Int) => v * v)
    df.select($"id", callUdf("simpleUdf", $"value"))
    Since

    1.4.0

  44. def cbrt(columnName: String): Column

    Computes the cube-root of the given column.

    Computes the cube-root of the given column.

    Since

    1.4.0

  45. def cbrt(e: Column): Column

    Computes the cube-root of the given value.

    Computes the cube-root of the given value.

    Since

    1.4.0

  46. def ceil(columnName: String): Column

    Computes the ceiling of the given column.

    Computes the ceiling of the given column.

    Since

    1.4.0

  47. def ceil(e: Column): Column

    Computes the ceiling of the given value.

    Computes the ceiling of the given value.

    Since

    1.4.0

  48. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  49. def coalesce(e: Column*): Column

    Returns the first column that is not null.

    Returns the first column that is not null.

    df.select(coalesce(df("a"), df("b")))
    Annotations
    @varargs()
    Since

    1.3.0

  50. def col(colName: String): Column

    Returns a Column based on the given column name.

    Returns a Column based on the given column name.

    Since

    1.3.0

  51. def column(colName: String): Column

    Returns a Column based on the given column name.

    Returns a Column based on the given column name. Alias of col.

    Since

    1.3.0

  52. def cos(columnName: String): Column

    Computes the cosine of the given column.

    Computes the cosine of the given column.

    Since

    1.4.0

  53. def cos(e: Column): Column

    Computes the cosine of the given value.

    Computes the cosine of the given value.

    Since

    1.4.0

  54. def cosh(columnName: String): Column

    Computes the hyperbolic cosine of the given column.

    Computes the hyperbolic cosine of the given column.

    Since

    1.4.0

  55. def cosh(e: Column): Column

    Computes the hyperbolic cosine of the given value.

    Computes the hyperbolic cosine of the given value.

    Since

    1.4.0

  56. def count(columnName: String): Column

    Aggregate function: returns the number of items in a group.

    Aggregate function: returns the number of items in a group.

    Since

    1.3.0

  57. def count(e: Column): Column

    Aggregate function: returns the number of items in a group.

    Aggregate function: returns the number of items in a group.

    Since

    1.3.0

  58. def countDistinct(columnName: String, columnNames: String*): Column

    Aggregate function: returns the number of distinct items in a group.

    Aggregate function: returns the number of distinct items in a group.

    Annotations
    @varargs()
    Since

    1.3.0

  59. def countDistinct(expr: Column, exprs: Column*): Column

    Aggregate function: returns the number of distinct items in a group.

    Aggregate function: returns the number of distinct items in a group.
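
    A minimal sketch, assuming a DataFrame df with "dept", "firstName" and "lastName" columns:

    // Number of distinct (firstName, lastName) pairs per department.
    df.groupBy("dept").agg(countDistinct(df("firstName"), df("lastName")))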

    Annotations
    @varargs()
    Since

    1.3.0

  60. def cumeDist(): Column

    Window function: returns the cumulative distribution of values within a window partition, i.e. the fraction of rows that are below the current row.

    Window function: returns the cumulative distribution of values within a window partition, i.e. the fraction of rows that are below the current row.

    N = total number of rows in the partition
    cumeDist(x) = number of values before (and including) x / N

    This is equivalent to the CUME_DIST function in SQL.

    Since

    1.4.0

  61. def denseRank(): Column

    Window function: returns the rank of rows within a window partition, without any gaps.

    Window function: returns the rank of rows within a window partition, without any gaps.

    The difference between rank and denseRank is that denseRank leaves no gaps in ranking sequence when there are ties. That is, if you were ranking a competition using denseRank and had three people tie for second place, you would say that all three were in second place and that the next person came in third.

    This is equivalent to the DENSE_RANK function in SQL.

    Since

    1.4.0

  62. def desc(columnName: String): Column

    Returns a sort expression based on the descending order of the column.

    Returns a sort expression based on the descending order of the column.

    // Sort by dept in ascending order, and then age in descending order.
    df.sort(asc("dept"), desc("age"))
    Since

    1.3.0

  63. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  64. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  65. def exp(columnName: String): Column

    Computes the exponential of the given column.

    Computes the exponential of the given column.

    Since

    1.4.0

  66. def exp(e: Column): Column

    Computes the exponential of the given value.

    Computes the exponential of the given value.

    Since

    1.4.0

  67. def explode(e: Column): Column

    Creates a new row for each element in the given array or map column.
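
    A minimal sketch, assuming a DataFrame df with an array column "words":

    // One output row per array element, exposed as a column named "word".
    df.select(explode(df("words")).as("word"))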

  68. def expm1(columnName: String): Column

    Computes the exponential of the given column minus one.

    Computes the exponential of the given column minus one.

    Since

    1.4.0

  69. def expm1(e: Column): Column

    Computes the exponential of the given value minus one.

    Computes the exponential of the given value minus one.

    Since

    1.4.0

  70. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  71. def first(columnName: String): Column

    Aggregate function: returns the first value of a column in a group.

    Aggregate function: returns the first value of a column in a group.

    Since

    1.3.0

  72. def first(e: Column): Column

    Aggregate function: returns the first value in a group.

    Aggregate function: returns the first value in a group.

    Since

    1.3.0

  73. def floor(columnName: String): Column

    Computes the floor of the given column.

    Computes the floor of the given column.

    Since

    1.4.0

  74. def floor(e: Column): Column

    Computes the floor of the given value.

    Computes the floor of the given value.

    Since

    1.4.0

  75. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  76. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  77. def hypot(l: Double, rightName: String): Column

    Computes sqrt(a² + b²) without intermediate overflow or underflow.

    Computes sqrt(a² + b²) without intermediate overflow or underflow.

    Since

    1.4.0

  78. def hypot(l: Double, r: Column): Column

    Computes sqrt(a² + b²) without intermediate overflow or underflow.

    Computes sqrt(a² + b²) without intermediate overflow or underflow.

    Since

    1.4.0

  79. def hypot(leftName: String, r: Double): Column

    Computes sqrt(a² + b²) without intermediate overflow or underflow.

    Computes sqrt(a² + b²) without intermediate overflow or underflow.

    Since

    1.4.0

  80. def hypot(l: Column, r: Double): Column

    Computes sqrt(a² + b²) without intermediate overflow or underflow.

    Computes sqrt(a² + b²) without intermediate overflow or underflow.

    Since

    1.4.0

  81. def hypot(leftName: String, rightName: String): Column

    Computes sqrt(a² + b²) without intermediate overflow or underflow.

    Computes sqrt(a² + b²) without intermediate overflow or underflow.

    Since

    1.4.0

  82. def hypot(leftName: String, r: Column): Column

    Computes sqrt(a² + b²) without intermediate overflow or underflow.

    Computes sqrt(a² + b²) without intermediate overflow or underflow.

    Since

    1.4.0

  83. def hypot(l: Column, rightName: String): Column

    Computes sqrt(a² + b²) without intermediate overflow or underflow.

    Computes sqrt(a² + b²) without intermediate overflow or underflow.

    Since

    1.4.0

  84. def hypot(l: Column, r: Column): Column

    Computes sqrt(a² + b²) without intermediate overflow or underflow.

    Computes sqrt(a² + b²) without intermediate overflow or underflow.

    Since

    1.4.0

  85. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  86. def lag(e: Column, offset: Int, defaultValue: Any): Column

    Window function: returns the value that is offset rows before the current row, and defaultValue if there is less than offset rows before the current row.

    Window function: returns the value that is offset rows before the current row, and defaultValue if there is less than offset rows before the current row. For example, an offset of one will return the previous row at any given point in the window partition.

    This is equivalent to the LAG function in SQL.

    Since

    1.4.0

  87. def lag(columnName: String, offset: Int, defaultValue: Any): Column

    Window function: returns the value that is offset rows before the current row, and defaultValue if there is less than offset rows before the current row.

    Window function: returns the value that is offset rows before the current row, and defaultValue if there is less than offset rows before the current row. For example, an offset of one will return the previous row at any given point in the window partition.

    This is equivalent to the LAG function in SQL.

    Since

    1.4.0

  88. def lag(columnName: String, offset: Int): Column

    Window function: returns the value that is offset rows before the current row, and null if there is less than offset rows before the current row.

    Window function: returns the value that is offset rows before the current row, and null if there is less than offset rows before the current row. For example, an offset of one will return the previous row at any given point in the window partition.

    This is equivalent to the LAG function in SQL.

    Since

    1.4.0

  89. def lag(e: Column, offset: Int): Column

    Window function: returns the value that is offset rows before the current row, and null if there is less than offset rows before the current row.

    Window function: returns the value that is offset rows before the current row, and null if there is less than offset rows before the current row. For example, an offset of one will return the previous row at any given point in the window partition.

    This is equivalent to the LAG function in SQL.
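
    A minimal sketch, assuming a DataFrame df with "dept", "name", "hireDate" and "salary" columns:

    import org.apache.spark.sql.expressions.Window

    // Salary of the previous row within each department, ordered by hire date
    // (null for the first row of each partition).
    val w = Window.partitionBy("dept").orderBy("hireDate")
    df.select(df("name"), lag(df("salary"), 1).over(w))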

    Since

    1.4.0

  90. def last(columnName: String): Column

    Aggregate function: returns the last value of the column in a group.

    Aggregate function: returns the last value of the column in a group.

    Since

    1.3.0

  91. def last(e: Column): Column

    Aggregate function: returns the last value in a group.

    Aggregate function: returns the last value in a group.

    Since

    1.3.0

  92. def lead(e: Column, offset: Int, defaultValue: Any): Column

    Window function: returns the value that is offset rows after the current row, and defaultValue if there is less than offset rows after the current row.

    Window function: returns the value that is offset rows after the current row, and defaultValue if there is less than offset rows after the current row. For example, an offset of one will return the next row at any given point in the window partition.

    This is equivalent to the LEAD function in SQL.

    Since

    1.4.0

  93. def lead(columnName: String, offset: Int, defaultValue: Any): Column

    Window function: returns the value that is offset rows after the current row, and defaultValue if there is less than offset rows after the current row.

    Window function: returns the value that is offset rows after the current row, and defaultValue if there is less than offset rows after the current row. For example, an offset of one will return the next row at any given point in the window partition.

    This is equivalent to the LEAD function in SQL.

    Since

    1.4.0

  94. def lead(e: Column, offset: Int): Column

    Window function: returns the value that is offset rows after the current row, and null if there is less than offset rows after the current row.

    Window function: returns the value that is offset rows after the current row, and null if there is less than offset rows after the current row. For example, an offset of one will return the next row at any given point in the window partition.

    This is equivalent to the LEAD function in SQL.

    Since

    1.4.0

  95. def lead(columnName: String, offset: Int): Column

    Window function: returns the value that is offset rows after the current row, and null if there is less than offset rows after the current row.

    Window function: returns the value that is offset rows after the current row, and null if there is less than offset rows after the current row. For example, an offset of one will return the next row at any given point in the window partition.

    This is equivalent to the LEAD function in SQL.

    Since

    1.4.0

  96. def lit(literal: Any): Column

    Creates a Column of literal value.

    Creates a Column of literal value.

    The passed in object is returned directly if it is already a Column. If the object is a Scala Symbol, it is converted into a Column also. Otherwise, a new Column is created to represent the literal value.
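
    A minimal sketch, assuming a DataFrame df with "price" and "status" columns:

    // Add a constant column, and compare a column against a literal.
    df.select(df("price"), lit(0.08).as("taxRate"))
    df.filter(df("status") === lit("active"))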

    Since

    1.3.0

  97. def log(columnName: String): Column

    Computes the natural logarithm of the given column.

    Computes the natural logarithm of the given column.

    Since

    1.4.0

  98. def log(e: Column): Column

    Computes the natural logarithm of the given value.

    Computes the natural logarithm of the given value.

    Since

    1.4.0

  99. def log10(columnName: String): Column

    Computes the logarithm of the given value in Base 10.

    Computes the logarithm of the given value in Base 10.

    Since

    1.4.0

  100. def log10(e: Column): Column

    Computes the logarithm of the given value in Base 10.

    Computes the logarithm of the given value in Base 10.

    Since

    1.4.0

  101. def log1p(columnName: String): Column

    Computes the natural logarithm of the given column plus one.

    Computes the natural logarithm of the given column plus one.

    Since

    1.4.0

  102. def log1p(e: Column): Column

    Computes the natural logarithm of the given value plus one.

    Computes the natural logarithm of the given value plus one.

    Since

    1.4.0

  103. def lower(e: Column): Column

    Converts a string expression to lower case.

    Converts a string expression to lower case.

    Since

    1.3.0

  104. def max(columnName: String): Column

    Aggregate function: returns the maximum value of the column in a group.

    Aggregate function: returns the maximum value of the column in a group.

    Since

    1.3.0

  105. def max(e: Column): Column

    Aggregate function: returns the maximum value of the expression in a group.

    Aggregate function: returns the maximum value of the expression in a group.

    Since

    1.3.0

  106. def mean(columnName: String): Column

    Aggregate function: returns the average of the values in a group.

    Aggregate function: returns the average of the values in a group. Alias for avg.

    Since

    1.4.0

  107. def mean(e: Column): Column

    Aggregate function: returns the average of the values in a group.

    Aggregate function: returns the average of the values in a group. Alias for avg.

    Since

    1.4.0

  108. def min(columnName: String): Column

    Aggregate function: returns the minimum value of the column in a group.

    Aggregate function: returns the minimum value of the column in a group.

    Since

    1.3.0

  109. def min(e: Column): Column

    Aggregate function: returns the minimum value of the expression in a group.

    Aggregate function: returns the minimum value of the expression in a group.

    Since

    1.3.0

  110. def monotonicallyIncreasingId(): Column

    A column expression that generates monotonically increasing 64-bit integers.

    A column expression that generates monotonically increasing 64-bit integers.

    The generated ID is guaranteed to be monotonically increasing and unique, but not consecutive. The current implementation puts the partition ID in the upper 31 bits, and the record number within each partition in the lower 33 bits. The assumption is that the data frame has less than 1 billion partitions, and each partition has less than 8 billion records.

    As an example, consider a DataFrame with two partitions, each with 3 records. This expression would return the following IDs: 0, 1, 2, 8589934592 (1L << 33), 8589934593, 8589934594.
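
    A minimal sketch, assuming a DataFrame df with a "value" column:

    // Attach a unique (but not consecutive) 64-bit id to every row.
    df.select(monotonicallyIncreasingId().as("id"), df("value"))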

    Since

    1.4.0

  111. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  112. def negate(e: Column): Column

    Unary minus, i.e. negate the expression.

    Unary minus, i.e. negate the expression.

    // Select the amount column and negates all values.
    // Scala:
    df.select( -df("amount") )
    
    // Java:
    df.select( negate(df.col("amount")) );
    Since

    1.3.0

  113. def not(e: Column): Column

    Inversion of boolean expression, i.e. NOT.

    Inversion of boolean expression, i.e. NOT.

    // Scala: select rows that are not active (isActive === false)
    df.filter( !df("isActive") )
    
    // Java:
    df.filter( not(df.col("isActive")) );
    Since

    1.3.0

  114. final def notify(): Unit

    Definition Classes
    AnyRef
  115. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  116. def ntile(n: Int): Column

    Window function: returns the ntile group id (from 1 to n inclusive) in an ordered window partition.

    Window function: returns the ntile group id (from 1 to n inclusive) in an ordered window partition. For example, if n is 4, the first quarter of the rows will get value 1, the second quarter will get 2, the third quarter will get 3, and the last quarter will get 4.

    This is equivalent to the NTILE function in SQL.

    Since

    1.4.0

  117. def percentRank(): Column

    Window function: returns the relative rank (i.e. percentile) of rows within a window partition.

    Window function: returns the relative rank (i.e. percentile) of rows within a window partition.

    This is computed by:

    (rank of row in its partition - 1) / (number of rows in the partition - 1)

    This is equivalent to the PERCENT_RANK function in SQL.

    Since

    1.4.0

  118. def pow(l: Double, rightName: String): Column

    Returns the value of the first argument raised to the power of the second argument.

    Returns the value of the first argument raised to the power of the second argument.

    Since

    1.4.0

  119. def pow(l: Double, r: Column): Column

    Returns the value of the first argument raised to the power of the second argument.

    Returns the value of the first argument raised to the power of the second argument.

    Since

    1.4.0

  120. def pow(leftName: String, r: Double): Column

    Returns the value of the first argument raised to the power of the second argument.

    Returns the value of the first argument raised to the power of the second argument.

    Since

    1.4.0

  121. def pow(l: Column, r: Double): Column

    Returns the value of the first argument raised to the power of the second argument.

    Returns the value of the first argument raised to the power of the second argument.

    Since

    1.4.0

  122. def pow(leftName: String, rightName: String): Column

    Returns the value of the first argument raised to the power of the second argument.

    Returns the value of the first argument raised to the power of the second argument.

    Since

    1.4.0

  123. def pow(leftName: String, r: Column): Column

    Returns the value of the first argument raised to the power of the second argument.

    Returns the value of the first argument raised to the power of the second argument.

    Since

    1.4.0

  124. def pow(l: Column, rightName: String): Column

    Returns the value of the first argument raised to the power of the second argument.

    Returns the value of the first argument raised to the power of the second argument.

    Since

    1.4.0

  125. def pow(l: Column, r: Column): Column

    Returns the value of the first argument raised to the power of the second argument.

    Returns the value of the first argument raised to the power of the second argument.

    Since

    1.4.0

  126. def rand(): Column

    Generate a random column with i.i.d. samples from U[0.0, 1.0].

    Generate a random column with i.i.d. samples from U[0.0, 1.0].

    Since

    1.4.0

  127. def rand(seed: Long): Column

    Generate a random column with i.i.d. samples from U[0.0, 1.0].

    Generate a random column with i.i.d. samples from U[0.0, 1.0].
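
    A minimal sketch, assuming a DataFrame df with an "id" column:

    // Reproducible uniform noise; keep roughly 10% of the rows.
    val withNoise = df.select(df("id"), rand(42L).as("u"))
    withNoise.filter(withNoise("u") < 0.1)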

    Since

    1.4.0

  128. def randn(): Column

    Generate a column with i.i.d. samples from the standard normal distribution.

    Generate a column with i.i.d. samples from the standard normal distribution.

    Since

    1.4.0

  129. def randn(seed: Long): Column

    Generate a column with i.i.d. samples from the standard normal distribution.

    Generate a column with i.i.d. samples from the standard normal distribution.

    Since

    1.4.0

  130. def rank(): Column

    Window function: returns the rank of rows within a window partition.

    Window function: returns the rank of rows within a window partition.

    The difference between rank and denseRank is that denseRank leaves no gaps in ranking sequence when there are ties. That is, if you were ranking a competition using denseRank and had three people tie for second place, you would say that all three were in second place and that the next person came in third.

    This is equivalent to the RANK function in SQL.

    Since

    1.4.0

  131. def rint(columnName: String): Column

    Returns the double value that is closest in value to the argument and is equal to a mathematical integer.

    Returns the double value that is closest in value to the argument and is equal to a mathematical integer.

    Since

    1.4.0

  132. def rint(e: Column): Column

    Returns the double value that is closest in value to the argument and is equal to a mathematical integer.

    Returns the double value that is closest in value to the argument and is equal to a mathematical integer.

    Since

    1.4.0

  133. def rowNumber(): Column

    Window function: returns a sequential number starting at 1 within a window partition.

    Window function: returns a sequential number starting at 1 within a window partition.

    This is equivalent to the ROW_NUMBER function in SQL.

    Since

    1.4.0

  134. def signum(columnName: String): Column

    Computes the signum of the given column.

    Computes the signum of the given column.

    Since

    1.4.0

  135. def signum(e: Column): Column

    Computes the signum of the given value.

    Computes the signum of the given value.

    Since

    1.4.0

  136. def sin(columnName: String): Column

    Computes the sine of the given column.

    Computes the sine of the given column.

    Since

    1.4.0

  137. def sin(e: Column): Column

    Computes the sine of the given value.

    Computes the sine of the given value.

    Since

    1.4.0

  138. def sinh(columnName: String): Column

    Computes the hyperbolic sine of the given column.

    Computes the hyperbolic sine of the given column.

    Since

    1.4.0

  139. def sinh(e: Column): Column

    Computes the hyperbolic sine of the given value.

    Computes the hyperbolic sine of the given value.

    Since

    1.4.0

  140. def sparkPartitionId(): Column

    Partition ID of the Spark task.

    Partition ID of the Spark task.

    Note that this is non-deterministic because it depends on data partitioning and task scheduling.

    Since

    1.4.0

  141. def sqrt(e: Column): Column

    Computes the square root of the specified float value.

    Computes the square root of the specified float value.

    Since

    1.3.0

  142. def struct(colName: String, colNames: String*): Column

    Creates a new struct column that composes multiple input columns.

    Creates a new struct column that composes multiple input columns.

    Since

    1.4.0

  143. def struct(cols: Column*): Column

    Creates a new struct column.

    Creates a new struct column. The input column must be a column in a DataFrame, or a derived column expression that is named (i.e. aliased).
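
    A minimal sketch, assuming a DataFrame df with "firstName" and "lastName" columns:

    // Pack two named columns into a single struct column.
    df.select(struct(df("firstName"), df("lastName")).as("fullName"))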

    Annotations
    @varargs()
    Since

    1.4.0

  144. def sum(columnName: String): Column

    Aggregate function: returns the sum of all values in the given column.

    Aggregate function: returns the sum of all values in the given column.

    Since

    1.3.0

  145. def sum(e: Column): Column

    Aggregate function: returns the sum of all values in the expression.

    Aggregate function: returns the sum of all values in the expression.

    Since

    1.3.0

  146. def sumDistinct(columnName: String): Column

    Aggregate function: returns the sum of distinct values in the expression.

    Aggregate function: returns the sum of distinct values in the expression.

    Since

    1.3.0

  147. def sumDistinct(e: Column): Column

    Aggregate function: returns the sum of distinct values in the expression.

    Aggregate function: returns the sum of distinct values in the expression.

    Since

    1.3.0

  148. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  149. def tan(columnName: String): Column

    Computes the tangent of the given column.

    Computes the tangent of the given column.

    Since

    1.4.0

  150. def tan(e: Column): Column

    Computes the tangent of the given value.

    Computes the tangent of the given value.

    Since

    1.4.0

  151. def tanh(columnName: String): Column

    Computes the hyperbolic tangent of the given column.

    Computes the hyperbolic tangent of the given column.

    Since

    1.4.0

  152. def tanh(e: Column): Column

    Computes the hyperbolic tangent of the given value.

    Computes the hyperbolic tangent of the given value.

    Since

    1.4.0

  153. def toDegrees(columnName: String): Column

    Converts an angle measured in radians to an approximately equivalent angle measured in degrees.

    Converts an angle measured in radians to an approximately equivalent angle measured in degrees.

    Since

    1.4.0

  154. def toDegrees(e: Column): Column

    Converts an angle measured in radians to an approximately equivalent angle measured in degrees.

    Converts an angle measured in radians to an approximately equivalent angle measured in degrees.

    Since

    1.4.0

  155. def toRadians(columnName: String): Column

    Converts an angle measured in degrees to an approximately equivalent angle measured in radians.

    Converts an angle measured in degrees to an approximately equivalent angle measured in radians.

    Since

    1.4.0

  156. def toRadians(e: Column): Column

    Converts an angle measured in degrees to an approximately equivalent angle measured in radians.

    Converts an angle measured in degrees to an approximately equivalent angle measured in radians.

    Since

    1.4.0

  157. def toString(): String

    Definition Classes
    AnyRef → Any
  158. def udf[RT, A1, A2, A3, A4, A5, A6, A7, A8, A9, A10](f: (A1, A2, A3, A4, A5, A6, A7, A8, A9, A10) ⇒ RT)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[RT], arg1: scala.reflect.api.JavaUniverse.TypeTag[A1], arg2: scala.reflect.api.JavaUniverse.TypeTag[A2], arg3: scala.reflect.api.JavaUniverse.TypeTag[A3], arg4: scala.reflect.api.JavaUniverse.TypeTag[A4], arg5: scala.reflect.api.JavaUniverse.TypeTag[A5], arg6: scala.reflect.api.JavaUniverse.TypeTag[A6], arg7: scala.reflect.api.JavaUniverse.TypeTag[A7], arg8: scala.reflect.api.JavaUniverse.TypeTag[A8], arg9: scala.reflect.api.JavaUniverse.TypeTag[A9], arg10: scala.reflect.api.JavaUniverse.TypeTag[A10]): UserDefinedFunction

    Defines a user-defined function of 10 arguments as user-defined function (UDF).

    Defines a user-defined function of 10 arguments as user-defined function (UDF). The data types are automatically inferred based on the function's signature.

    Since

    1.3.0

  159. def udf[RT, A1, A2, A3, A4, A5, A6, A7, A8, A9](f: (A1, A2, A3, A4, A5, A6, A7, A8, A9) ⇒ RT)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[RT], arg1: scala.reflect.api.JavaUniverse.TypeTag[A1], arg2: scala.reflect.api.JavaUniverse.TypeTag[A2], arg3: scala.reflect.api.JavaUniverse.TypeTag[A3], arg4: scala.reflect.api.JavaUniverse.TypeTag[A4], arg5: scala.reflect.api.JavaUniverse.TypeTag[A5], arg6: scala.reflect.api.JavaUniverse.TypeTag[A6], arg7: scala.reflect.api.JavaUniverse.TypeTag[A7], arg8: scala.reflect.api.JavaUniverse.TypeTag[A8], arg9: scala.reflect.api.JavaUniverse.TypeTag[A9]): UserDefinedFunction

    Defines a user-defined function of 9 arguments as user-defined function (UDF).

    Defines a user-defined function of 9 arguments as user-defined function (UDF). The data types are automatically inferred based on the function's signature.

    Since

    1.3.0

  160. def udf[RT, A1, A2, A3, A4, A5, A6, A7, A8](f: (A1, A2, A3, A4, A5, A6, A7, A8) ⇒ RT)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[RT], arg1: scala.reflect.api.JavaUniverse.TypeTag[A1], arg2: scala.reflect.api.JavaUniverse.TypeTag[A2], arg3: scala.reflect.api.JavaUniverse.TypeTag[A3], arg4: scala.reflect.api.JavaUniverse.TypeTag[A4], arg5: scala.reflect.api.JavaUniverse.TypeTag[A5], arg6: scala.reflect.api.JavaUniverse.TypeTag[A6], arg7: scala.reflect.api.JavaUniverse.TypeTag[A7], arg8: scala.reflect.api.JavaUniverse.TypeTag[A8]): UserDefinedFunction

    Defines a user-defined function of 8 arguments as user-defined function (UDF).

    Defines a user-defined function of 8 arguments as user-defined function (UDF). The data types are automatically inferred based on the function's signature.

    Since

    1.3.0

  161. def udf[RT, A1, A2, A3, A4, A5, A6, A7](f: (A1, A2, A3, A4, A5, A6, A7) ⇒ RT)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[RT], arg1: scala.reflect.api.JavaUniverse.TypeTag[A1], arg2: scala.reflect.api.JavaUniverse.TypeTag[A2], arg3: scala.reflect.api.JavaUniverse.TypeTag[A3], arg4: scala.reflect.api.JavaUniverse.TypeTag[A4], arg5: scala.reflect.api.JavaUniverse.TypeTag[A5], arg6: scala.reflect.api.JavaUniverse.TypeTag[A6], arg7: scala.reflect.api.JavaUniverse.TypeTag[A7]): UserDefinedFunction

    Defines a user-defined function of 7 arguments as user-defined function (UDF).

    Defines a user-defined function of 7 arguments as user-defined function (UDF). The data types are automatically inferred based on the function's signature.

    Since

    1.3.0

  162. def udf[RT, A1, A2, A3, A4, A5, A6](f: (A1, A2, A3, A4, A5, A6) ⇒ RT)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[RT], arg1: scala.reflect.api.JavaUniverse.TypeTag[A1], arg2: scala.reflect.api.JavaUniverse.TypeTag[A2], arg3: scala.reflect.api.JavaUniverse.TypeTag[A3], arg4: scala.reflect.api.JavaUniverse.TypeTag[A4], arg5: scala.reflect.api.JavaUniverse.TypeTag[A5], arg6: scala.reflect.api.JavaUniverse.TypeTag[A6]): UserDefinedFunction

    Defines a user-defined function of 6 arguments as user-defined function (UDF).

    Defines a user-defined function of 6 arguments as user-defined function (UDF). The data types are automatically inferred based on the function's signature.

    Since

    1.3.0

  163. def udf[RT, A1, A2, A3, A4, A5](f: (A1, A2, A3, A4, A5) ⇒ RT)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[RT], arg1: scala.reflect.api.JavaUniverse.TypeTag[A1], arg2: scala.reflect.api.JavaUniverse.TypeTag[A2], arg3: scala.reflect.api.JavaUniverse.TypeTag[A3], arg4: scala.reflect.api.JavaUniverse.TypeTag[A4], arg5: scala.reflect.api.JavaUniverse.TypeTag[A5]): UserDefinedFunction

    Defines a user-defined function of 5 arguments as user-defined function (UDF).

    Defines a user-defined function of 5 arguments as user-defined function (UDF). The data types are automatically inferred based on the function's signature.

    Since

    1.3.0

  164. def udf[RT, A1, A2, A3, A4](f: (A1, A2, A3, A4) ⇒ RT)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[RT], arg1: scala.reflect.api.JavaUniverse.TypeTag[A1], arg2: scala.reflect.api.JavaUniverse.TypeTag[A2], arg3: scala.reflect.api.JavaUniverse.TypeTag[A3], arg4: scala.reflect.api.JavaUniverse.TypeTag[A4]): UserDefinedFunction

    Defines a user-defined function of 4 arguments as user-defined function (UDF).

    Defines a user-defined function of 4 arguments as user-defined function (UDF). The data types are automatically inferred based on the function's signature.

    Since

    1.3.0

  165. def udf[RT, A1, A2, A3](f: (A1, A2, A3) ⇒ RT)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[RT], arg1: scala.reflect.api.JavaUniverse.TypeTag[A1], arg2: scala.reflect.api.JavaUniverse.TypeTag[A2], arg3: scala.reflect.api.JavaUniverse.TypeTag[A3]): UserDefinedFunction

    Defines a user-defined function of 3 arguments as user-defined function (UDF).

    Defines a user-defined function of 3 arguments as user-defined function (UDF). The data types are automatically inferred based on the function's signature.

    Since

    1.3.0

  166. def udf[RT, A1, A2](f: (A1, A2) ⇒ RT)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[RT], arg1: scala.reflect.api.JavaUniverse.TypeTag[A1], arg2: scala.reflect.api.JavaUniverse.TypeTag[A2]): UserDefinedFunction

    Defines a user-defined function of 2 arguments as user-defined function (UDF).

    Defines a user-defined function of 2 arguments as user-defined function (UDF). The data types are automatically inferred based on the function's signature.

    Since

    1.3.0

  167. def udf[RT, A1](f: (A1) ⇒ RT)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[RT], arg1: scala.reflect.api.JavaUniverse.TypeTag[A1]): UserDefinedFunction

    Defines a user-defined function of 1 argument as user-defined function (UDF).

    Defines a user-defined function of 1 argument as user-defined function (UDF). The data types are automatically inferred based on the function's signature.
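
    A minimal sketch, assuming a DataFrame df with an integer column "value":

    // The input and return types are inferred from the function's signature.
    val plusOne = udf((x: Int) => x + 1)
    df.select(plusOne(df("value")).as("valuePlusOne"))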

    Since

    1.3.0

  168. def udf[RT](f: () ⇒ RT)(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[RT]): UserDefinedFunction

    Defines a user-defined function of 0 arguments as user-defined function (UDF).

    Defines a user-defined function of 0 arguments as user-defined function (UDF). The data types are automatically inferred based on the function's signature.

    Since

    1.3.0

  169. def upper(e: Column): Column

    Converts a string expression to upper case.

    Converts a string expression to upper case.

    Since

    1.3.0

  170. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  171. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  172. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  173. def when(condition: Column, value: Any): Column

    Evaluates a list of conditions and returns one of multiple possible result expressions.

    Evaluates a list of conditions and returns one of multiple possible result expressions. If otherwise is not defined at the end, null is returned for unmatched conditions.

    // Example: encoding gender string column into integer.
    
    // Scala:
    people.select(when(people("gender") === "male", 0)
      .when(people("gender") === "female", 1)
      .otherwise(2))
    
    // Java:
    people.select(when(col("gender").equalTo("male"), 0)
      .when(col("gender").equalTo("female"), 1)
      .otherwise(2))
    Since

    1.4.0
