org.apache.spark.sql
Class Column

Object
  extended by org.apache.spark.sql.Column
All Implemented Interfaces:
Logging
Direct Known Subclasses:
ColumnName

public class Column
extends Object
implements Logging

:: Experimental :: A column in a DataFrame.

Since:
1.3.0

Constructor Summary
Column(org.apache.spark.sql.catalyst.expressions.Expression expr)
           
Column(String name)
           
 
Method Summary
 Column alias(String alias)
          Gives the column an alias.
 Column and(Column other)
          Boolean AND.
 Column apply(Object extraction)
          Extracts a value or values from a complex type.
 Column as(scala.collection.Seq<String> aliases)
          (Scala-specific) Assigns the given aliases to the results of a table generating function.
 Column as(String alias)
          Gives the column an alias.
 Column as(String[] aliases)
          Assigns the given aliases to the results of a table generating function.
 Column as(String alias, Metadata metadata)
          Gives the column an alias with metadata.
 Column as(scala.Symbol alias)
          Gives the column an alias.
 Column asc()
          Returns an ordering used in sorting.
 Column between(Object lowerBound, Object upperBound)
          True if the current column is between the lower bound and upper bound, inclusive.
 Column bitwiseAND(Object other)
          Compute bitwise AND of this expression with another expression.
 Column bitwiseOR(Object other)
          Compute bitwise OR of this expression with another expression.
 Column bitwiseXOR(Object other)
          Compute bitwise XOR of this expression with another expression.
 Column cast(DataType to)
          Casts the column to a different data type.
 Column cast(String to)
          Casts the column to a different data type, using the canonical string representation of the type.
 Column contains(Object other)
          Contains the other element.
 Column desc()
          Returns an ordering used in sorting.
 Column divide(Object other)
          Divides this expression by another expression.
 Column endsWith(Column other)
          String ends with.
 Column endsWith(String literal)
          String ends with another string literal.
 Column eqNullSafe(Object other)
          Equality test that is safe for null values.
 boolean equals(Object that)
           
 Column equalTo(Object other)
          Equality test.
 void explain(boolean extended)
          Prints the expression to the console for debugging purposes.
 Column geq(Object other)
          Greater than or equal to an expression.
 Column getField(String fieldName)
          An expression that gets a field by name in a StructType.
 Column getItem(Object key)
          An expression that gets the item at the given position (ordinal) in an array, or the value for the given key in a MapType.
 Column gt(Object other)
          Greater than.
 int hashCode()
           
 Column in(Column... list)
          A boolean expression that evaluates to true if the value of this expression is contained in the evaluated values of the arguments.
 Column in(scala.collection.Seq<Column> list)
          A boolean expression that evaluates to true if the value of this expression is contained in the evaluated values of the arguments.
 Column isNotNull()
          True if the current expression is NOT null.
 Column isNull()
          True if the current expression is null.
 Column leq(Object other)
          Less than or equal to.
 Column like(String literal)
          SQL LIKE expression.
 Column lt(Object other)
          Less than.
 Column minus(Object other)
          Subtraction.
 Column mod(Object other)
          Modulo (a.k.a. remainder) expression.
 Column multiply(Object other)
          Multiplication of this expression and another expression.
 Column notEqual(Object other)
          Inequality test.
 Column or(Column other)
          Boolean OR.
 Column otherwise(Object value)
          Evaluates a list of conditions and returns one of multiple possible result expressions.
 Column over(WindowSpec window)
          Define a windowing column.
 Column plus(Object other)
          Sum of this expression and another expression.
 Column rlike(String literal)
          SQL RLIKE expression (LIKE with Regex).
 Column startsWith(Column other)
          String starts with.
 Column startsWith(String literal)
          String starts with another string literal.
 Column substr(Column startPos, Column len)
          An expression that returns a substring.
 Column substr(int startPos, int len)
          An expression that returns a substring.
 String toString()
           
static scala.Option<org.apache.spark.sql.catalyst.expressions.Expression> unapply(Column col)
           
 Column when(Column condition, Object value)
          Evaluates a list of conditions and returns one of multiple possible result expressions.
 
Methods inherited from class Object
getClass, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface org.apache.spark.Logging
initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
 

Constructor Detail

Column

public Column(org.apache.spark.sql.catalyst.expressions.Expression expr)

Column

public Column(String name)
Method Detail

unapply

public static scala.Option<org.apache.spark.sql.catalyst.expressions.Expression> unapply(Column col)

in

public Column in(Column... list)
A boolean expression that evaluates to true if the value of this expression is contained in the evaluated values of the arguments.

Parameters:
list - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

toString

public String toString()
Overrides:
toString in class Object

equals

public boolean equals(Object that)
Overrides:
equals in class Object

hashCode

public int hashCode()
Overrides:
hashCode in class Object

apply

public Column apply(Object extraction)
Extracts a value or values from a complex type. The following types of extraction are supported:
- Given an Array, an integer ordinal can be used to retrieve a single value.
- Given a Map, a key of the correct type can be used to retrieve an individual value.
- Given a Struct, a string fieldName can be used to extract that field.
- Given an Array of Structs, a string fieldName can be used to extract that field from every struct in the array, returning an Array of fields.

Parameters:
extraction - (undocumented)
Returns:
(undocumented)
Since:
1.4.0
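
The four extraction rules for apply can be modeled in plain Java to make the result shapes concrete. This is only a sketch and not Spark code: the ExtractionSketch class, its Struct stand-in, and the extract method are hypothetical names, a List stands in for an Array, and array ordinals are assumed to be zero-based.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the four extraction rules; not Spark code.
final class ExtractionSketch {

    // Stand-in for a struct value: a set of named fields.
    static final class Struct {
        final Map<String, Object> fields = new HashMap<>();

        Struct with(String name, Object value) {
            fields.put(name, value);
            return this;
        }
    }

    static Object extract(Object value, Object extraction) {
        if (value instanceof Struct) {
            // Struct + field name -> that field.
            return ((Struct) value).fields.get((String) extraction);
        }
        if (value instanceof Map) {
            // Map + key -> the value stored under that key.
            return ((Map<?, ?>) value).get(extraction);
        }
        List<?> array = (List<?>) value;
        if (extraction instanceof Integer) {
            // Array + integer ordinal -> a single element (zero-based here by assumption).
            return array.get((Integer) extraction);
        }
        // Array of structs + field name -> an array holding that field of every struct.
        List<Object> result = new ArrayList<>();
        for (Object element : array) {
            result.add(((Struct) element).fields.get((String) extraction));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Struct> people = Arrays.asList(
            new Struct().with("name", "Ann"),
            new Struct().with("name", "Bob"));
        System.out.println(extract(people, "name")); // [Ann, Bob]
    }
}
```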

equalTo

public Column equalTo(Object other)
Equality test.

   // Scala:
   df.filter( df("colA") === df("colB") )

   // Java
   import static org.apache.spark.sql.functions.*;
   df.filter( col("colA").equalTo(col("colB")) );
 

Parameters:
other - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

notEqual

public Column notEqual(Object other)
Inequality test.

   // Scala:
   df.select( df("colA") !== df("colB") )
   df.select( !(df("colA") === df("colB")) )

   // Java:
   import static org.apache.spark.sql.functions.*;
   df.filter( col("colA").notEqual(col("colB")) );
 

Parameters:
other - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

gt

public Column gt(Object other)
Greater than.

   // Scala: The following selects people older than 21.
   people.select( people("age") > lit(21) )

   // Java:
   import static org.apache.spark.sql.functions.*;
   people.select( people("age").gt(21) );
 

Parameters:
other - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

lt

public Column lt(Object other)
Less than.

   // Scala: The following selects people younger than 21.
   people.select( people("age") < 21 )

   // Java:
   people.select( people("age").lt(21) );
 

Parameters:
other - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

leq

public Column leq(Object other)
Less than or equal to.

   // Scala: The following selects people aged 21 or younger.
   people.select( people("age") <= 21 )

   // Java:
   people.select( people("age").leq(21) );
 

Parameters:
other - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

geq

public Column geq(Object other)
Greater than or equal to an expression.

   // Scala: The following selects people aged 21 or older.
   people.select( people("age") >= 21 )

   // Java:
   people.select( people("age").geq(21) );
 

Parameters:
other - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

eqNullSafe

public Column eqNullSafe(Object other)
Equality test that is safe for null values.

Parameters:
other - (undocumented)
Returns:
(undocumented)
Since:
1.3.0
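
The difference between equalTo and eqNullSafe only shows up when nulls are involved. The plain-Java sketch below models the usual SQL semantics, which are assumed here because this page does not spell them out: ordinary equality yields null (unknown) when either operand is null, while null-safe equality always yields true or false. The class and method names are hypothetical and are not part of Spark.

```java
import java.util.Objects;

// Hypothetical model of SQL equality semantics; not Spark code.
final class NullSafeEqualitySketch {

    // Ordinary SQL equality: the result is unknown (null) if either operand is null.
    static Boolean equalTo(Object left, Object right) {
        if (left == null || right == null) {
            return null;
        }
        return left.equals(right);
    }

    // Null-safe equality: two nulls are equal, and null never equals a non-null value.
    static boolean eqNullSafe(Object left, Object right) {
        return Objects.equals(left, right);
    }

    public static void main(String[] args) {
        System.out.println(equalTo(1, null));       // null
        System.out.println(eqNullSafe(1, null));    // false
        System.out.println(eqNullSafe(null, null)); // true
    }
}
```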

when

public Column when(Column condition,
                   Object value)
Evaluates a list of conditions and returns one of multiple possible result expressions. If otherwise is not defined at the end, null is returned for unmatched conditions.


   // Example: encoding gender string column into integer.

   // Scala:
   people.select(when(people("gender") === "male", 0)
     .when(people("gender") === "female", 1)
     .otherwise(2))

   // Java:
   people.select(when(col("gender").equalTo("male"), 0)
     .when(col("gender").equalTo("female"), 1)
     .otherwise(2));
 

Parameters:
condition - (undocumented)
value - (undocumented)
Returns:
(undocumented)
Since:
1.4.0

otherwise

public Column otherwise(Object value)
Evaluates a list of conditions and returns one of multiple possible result expressions. If otherwise is not defined at the end, null is returned for unmatched conditions.


   // Example: encoding gender string column into integer.

   // Scala:
   people.select(when(people("gender") === "male", 0)
     .when(people("gender") === "female", 1)
     .otherwise(2))

   // Java:
   people.select(when(col("gender").equalTo("male"), 0)
     .when(col("gender").equalTo("female"), 1)
     .otherwise(2));
 

Parameters:
value - (undocumented)
Returns:
(undocumented)
Since:
1.4.0
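
The evaluation order of a when/otherwise chain can be illustrated with a small plain-Java model. It assumes the usual CASE WHEN behavior, where the first condition that holds decides the result, and it shows the documented fallback to null when otherwise is never called. CaseWhenSketch and its evaluate method are hypothetical names, not Spark APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of a when/otherwise chain; not Spark code.
final class CaseWhenSketch<T, R> {
    private final List<Predicate<T>> conditions = new ArrayList<>();
    private final List<R> values = new ArrayList<>();
    private R fallback = null; // remains null if otherwise(...) is never called

    CaseWhenSketch<T, R> when(Predicate<T> condition, R value) {
        conditions.add(condition);
        values.add(value);
        return this;
    }

    CaseWhenSketch<T, R> otherwise(R value) {
        fallback = value;
        return this;
    }

    // Returns the value paired with the first condition that holds, else the fallback.
    R evaluate(T input) {
        for (int i = 0; i < conditions.size(); i++) {
            if (conditions.get(i).test(input)) {
                return values.get(i);
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        CaseWhenSketch<String, Integer> encoder = new CaseWhenSketch<String, Integer>()
            .when(g -> g.equals("male"), 0)
            .when(g -> g.equals("female"), 1);
        System.out.println(encoder.evaluate("unknown"));              // null
        System.out.println(encoder.otherwise(2).evaluate("unknown")); // 2
    }
}
```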

between

public Column between(Object lowerBound,
                      Object upperBound)
True if the current column is between the lower bound and upper bound, inclusive.

Parameters:
lowerBound - (undocumented)
upperBound - (undocumented)
Returns:
(undocumented)
Since:
1.4.0
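
Because both bounds are inclusive, between behaves like the conjunction of a geq test on the lower bound and a leq test on the upper bound. A minimal plain-Java sketch of that rule follows; BetweenSketch is a hypothetical name and the code does not touch Spark.

```java
// Hypothetical model of an inclusive range test; not Spark code.
final class BetweenSketch {

    // True when lower <= value <= upper; both bounds are included.
    static <T extends Comparable<T>> boolean between(T value, T lower, T upper) {
        return value.compareTo(lower) >= 0 && value.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        System.out.println(between(18, 18, 65)); // true
        System.out.println(between(66, 18, 65)); // false
    }
}
```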

isNull

public Column isNull()
True if the current expression is null.

Returns:
(undocumented)
Since:
1.3.0

isNotNull

public Column isNotNull()
True if the current expression is NOT null.

Returns:
(undocumented)
Since:
1.3.0

or

public Column or(Column other)
Boolean OR.

   // Scala: The following selects people that are in school or employed.
   people.filter( people("inSchool") || people("isEmployed") )

   // Java:
   people.filter( people("inSchool").or(people("isEmployed")) );
 

Parameters:
other - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

and

public Column and(Column other)
Boolean AND.

   // Scala: The following selects people that are in school and employed at the same time.
   people.select( people("inSchool") && people("isEmployed") )

   // Java:
   people.select( people("inSchool").and(people("isEmployed")) );
 

Parameters:
other - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

plus

public Column plus(Object other)
Sum of this expression and another expression.

   // Scala: The following selects the sum of a person's height and weight.
   people.select( people("height") + people("weight") )

   // Java:
   people.select( people("height").plus(people("weight")) );
 

Parameters:
other - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

minus

public Column minus(Object other)
Subtraction. Subtract the other expression from this expression.

   // Scala: The following selects the difference between people's height and their weight.
   people.select( people("height") - people("weight") )

   // Java:
   people.select( people("height").minus(people("weight")) );
 

Parameters:
other - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

multiply

public Column multiply(Object other)
Multiplication of this expression and another expression.

   // Scala: The following multiplies a person's height by their weight.
   people.select( people("height") * people("weight") )

   // Java:
   people.select( people("height").multiply(people("weight")) );
 

Parameters:
other - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

divide

public Column divide(Object other)
Divides this expression by another expression.

   // Scala: The following divides a person's height by their weight.
   people.select( people("height") / people("weight") )

   // Java:
   people.select( people("height").divide(people("weight")) );
 

Parameters:
other - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

mod

public Column mod(Object other)
Modulo (a.k.a. remainder) expression.

Parameters:
other - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

in

public Column in(scala.collection.Seq<Column> list)
A boolean expression that evaluates to true if the value of this expression is contained in the evaluated values of the arguments.

Parameters:
list - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

like

public Column like(String literal)
SQL LIKE expression.

Parameters:
literal - (undocumented)
Returns:
(undocumented)
Since:
1.3.0
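
For readers unfamiliar with SQL LIKE, the sketch below models the standard wildcard rules in plain Java: % matches any run of characters (including none), _ matches exactly one character, everything else is literal, and the pattern must cover the whole value. Escape characters are deliberately not modeled, and LikeSketch with its methods is a hypothetical helper rather than Spark's implementation.

```java
import java.util.regex.Pattern;

// Hypothetical model of SQL LIKE wildcards; not Spark code and without escape handling.
final class LikeSketch {

    // Translates a LIKE pattern to a regex: % -> .*, _ -> ., anything else is quoted literally.
    static String likeToRegex(String likePattern) {
        StringBuilder regex = new StringBuilder();
        for (char c : likePattern.toCharArray()) {
            if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return regex.toString();
    }

    // The whole value must match the pattern, as with SQL LIKE.
    static boolean like(String value, String likePattern) {
        return Pattern.compile(likeToRegex(likePattern), Pattern.DOTALL)
            .matcher(value)
            .matches();
    }

    public static void main(String[] args) {
        System.out.println(like("Alice", "A%"));    // true
        System.out.println(like("Alice", "A_ice")); // true
        System.out.println(like("Alice", "lic"));   // false
    }
}
```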

rlike

public Column rlike(String literal)
SQL RLIKE expression (LIKE with Regex).

Parameters:
literal - (undocumented)
Returns:
(undocumented)
Since:
1.3.0
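
Unlike LIKE, a regular expression is usually allowed to match anywhere inside the value unless it is anchored with ^ or $. The plain-Java sketch below assumes that "find" behavior for rlike; this page does not state it, so treat it as an assumption to verify against your Spark version. RlikeSketch is a hypothetical name.

```java
import java.util.regex.Pattern;

// Hypothetical model of RLIKE matching; not Spark code.
final class RlikeSketch {

    // Assumption: the regex may match any part of the value (Matcher.find),
    // so anchors are needed to pin the match to the start or end.
    static boolean rlike(String value, String regex) {
        return Pattern.compile(regex).matcher(value).find();
    }

    public static void main(String[] args) {
        System.out.println(rlike("spark-1.4.0", "\\d+\\.\\d+")); // true
        System.out.println(rlike("spark", "^ark"));              // false
        System.out.println(rlike("spark", "ark$"));              // true
    }
}
```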

getItem

public Column getItem(Object key)
An expression that gets the item at the given position (ordinal) in an array, or the value for the given key in a MapType.

Parameters:
key - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

getField

public Column getField(String fieldName)
An expression that gets a field by name in a StructType.

Parameters:
fieldName - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

substr

public Column substr(Column startPos,
                     Column len)
An expression that returns a substring.

Parameters:
startPos - expression for the starting position.
len - expression for the length of the substring.

Returns:
(undocumented)
Since:
1.3.0

substr

public Column substr(int startPos,
                     int len)
An expression that returns a substring.

Parameters:
startPos - starting position.
len - length of the substring.

Returns:
(undocumented)
Since:
1.3.0
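
A common stumbling block with substr is the index base. The sketch below assumes the SQL convention that startPos is 1-based (position 1 is the first character), which this page does not state explicitly; zero and negative start positions are not modeled. SubstrSketch is a hypothetical plain-Java helper, not Spark code.

```java
// Hypothetical model of a 1-based substring; not Spark code.
final class SubstrSketch {

    // Assumption: startPos is 1-based, as in SQL SUBSTR.
    // Only startPos >= 1 and len >= 0 are modeled; the result is clipped to the value's end.
    static String substr(String value, int startPos, int len) {
        int begin = Math.min(startPos - 1, value.length());
        int end = Math.min(begin + len, value.length());
        return value.substring(begin, end);
    }

    public static void main(String[] args) {
        System.out.println(substr("Spark SQL", 1, 5)); // Spark
        System.out.println(substr("Spark SQL", 7, 3)); // SQL
    }
}
```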

contains

public Column contains(Object other)
Contains the other element.

Parameters:
other - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

startsWith

public Column startsWith(Column other)
String starts with.

Parameters:
other - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

startsWith

public Column startsWith(String literal)
String starts with another string literal.

Parameters:
literal - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

endsWith

public Column endsWith(Column other)
String ends with.

Parameters:
other - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

endsWith

public Column endsWith(String literal)
String ends with another string literal.

Parameters:
literal - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

alias

public Column alias(String alias)
Gives the column an alias. Same as as(String).

   // Renames colA to colB in select output.
   df.select($"colA".alias("colB"))
 

Parameters:
alias - (undocumented)
Returns:
(undocumented)
Since:
1.4.0

as

public Column as(String alias)
Gives the column an alias.

   // Renames colA to colB in select output.
   df.select($"colA".as("colB"))
 

Parameters:
alias - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

as

public Column as(scala.collection.Seq<String> aliases)
(Scala-specific) Assigns the given aliases to the results of a table generating function.

   // Names the key and value columns produced by explode.
   df.select(explode($"myMap").as("key" :: "value" :: Nil))
 

Parameters:
aliases - (undocumented)
Returns:
(undocumented)
Since:
1.4.0

as

public Column as(String[] aliases)
Assigns the given aliases to the results of a table generating function.

   // Names the key and value columns produced by explode.
   df.select(explode($"myMap").as(Array("key", "value")))
 

Parameters:
aliases - (undocumented)
Returns:
(undocumented)
Since:
1.4.0

as

public Column as(scala.Symbol alias)
Gives the column an alias.

   // Renames colA to colB in select output.
   df.select($"colA".as('colB))
 

Parameters:
alias - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

as

public Column as(String alias,
                 Metadata metadata)
Gives the column an alias with metadata.

   val metadata: Metadata = ...
   df.select($"colA".as("colB", metadata))
 

Parameters:
alias - (undocumented)
metadata - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

cast

public Column cast(DataType to)
Casts the column to a different data type.

   // Casts colA to IntegerType.
   import org.apache.spark.sql.types.IntegerType
   df.select(df("colA").cast(IntegerType))

   // equivalent to
   df.select(df("colA").cast("int"))
 

Parameters:
to - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

cast

public Column cast(String to)
Casts the column to a different data type, using the canonical string representation of the type. The supported types are: string, boolean, byte, short, int, long, float, double, decimal, date, timestamp.

   // Casts colA to integer.
   df.select(df("colA").cast("int"))
 

Parameters:
to - (undocumented)
Returns:
(undocumented)
Since:
1.3.0

desc

public Column desc()
Returns an ordering used in sorting.

   // Scala: sort a DataFrame by age column in descending order.
   df.sort(df("age").desc)

   // Java:
   df.sort(df.col("age").desc());
 

Returns:
(undocumented)
Since:
1.3.0

asc

public Column asc()
Returns an ordering used in sorting.

   // Scala: sort a DataFrame by age column in ascending order.
   df.sort(df("age").asc)

   // Java:
   df.sort(df.col("age").asc());
 

Returns:
(undocumented)
Since:
1.3.0

explain

public void explain(boolean extended)
Prints the expression to the console for debugging purposes.

Parameters:
extended - (undocumented)
Since:
1.3.0

bitwiseOR

public Column bitwiseOR(Object other)
Compute bitwise OR of this expression with another expression.

   df.select($"colA".bitwiseOR($"colB"))
 

Parameters:
other - (undocumented)
Returns:
(undocumented)
Since:
1.4.0

bitwiseAND

public Column bitwiseAND(Object other)
Compute bitwise AND of this expression with another expression.

   df.select($"colA".bitwiseAND($"colB"))
 

Parameters:
other - (undocumented)
Returns:
(undocumented)
Since:
1.4.0

bitwiseXOR

public Column bitwiseXOR(Object other)
Compute bitwise XOR of this expression with another expression.

   df.select($"colA".bitwiseXOR($"colB"))
 

Parameters:
other - (undocumented)
Returns:
(undocumented)
Since:
1.4.0
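
The three bitwise methods combine the bits of two integral values position by position. As a quick reminder of what that means for a single pair of values, the plain-Java sketch below uses the JVM integer operators, which are assumed here to match the per-row results of bitwiseOR, bitwiseAND and bitwiseXOR. BitwiseSketch and bits are hypothetical names.

```java
// Hypothetical illustration of bitwise OR, AND and XOR on two ints; not Spark code.
final class BitwiseSketch {

    // Renders a non-negative int as a binary string without leading zeros.
    static String bits(int value) {
        return Integer.toBinaryString(value);
    }

    public static void main(String[] args) {
        int a = 12; // binary 1100
        int b = 10; // binary 1010
        System.out.println(bits(a | b)); // 1110
        System.out.println(bits(a & b)); // 1000
        System.out.println(bits(a ^ b)); // 110
    }
}
```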

over

public Column over(WindowSpec window)
Define a windowing column.


   val w = Window.partitionBy("name").orderBy("id")
   df.select(
     sum("price").over(w.rangeBetween(Long.MinValue, 2)),
     avg("price").over(w.rowsBetween(0, 4))
   )
 

Parameters:
window - (undocumented)
Returns:
(undocumented)
Since:
1.4.0
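
To make the row-based frame in the example above concrete: rowsBetween(0, 4) covers the current row and up to four rows after it within the same partition. The plain-Java sketch below computes that kind of sliding average over one already-ordered partition. It is only a model under the assumption that start <= 0 <= end, so every frame contains at least the current row; RowsFrameSketch and avgOverRows are hypothetical names, not Spark APIs.

```java
// Hypothetical model of avg(...).over(w.rowsBetween(start, end)) for one ordered partition.
final class RowsFrameSketch {

    // For each row i, averages values[i + start .. i + end], clipped to the partition bounds.
    // Assumes start <= 0 <= end so that the frame is never empty.
    static double[] avgOverRows(double[] values, int start, int end) {
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            int from = Math.max(0, i + start);
            int to = Math.min(values.length - 1, i + end);
            double sum = 0;
            for (int j = from; j <= to; j++) {
                sum += values[j];
            }
            out[i] = sum / (to - from + 1);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] prices = {1, 2, 3, 4, 5, 6};
        double[] avg = avgOverRows(prices, 0, 4);
        System.out.println(avg[0]); // 3.0 (rows 1..5)
        System.out.println(avg[2]); // 4.5 (rows 3..6, clipped at the partition end)
    }
}
```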