Class Column
- All Implemented Interfaces:
- org.apache.spark.internal.Logging
- org.apache.spark.sql.internal.TableValuedFunctionArgument
- Direct Known Subclasses:
- ColumnName
- TypedColumn
A column that will be computed based on the data in a DataFrame.
 A new column can be constructed based on the input columns present in a DataFrame:
   df("columnName")            // On a specific `df` DataFrame.
   col("columnName")           // A generic column not yet associated with a DataFrame.
   col("columnName.field")     // Extracting a struct field
   col("`a.column.with.dots`") // Escape `.` in column names.
   $"columnName"               // Scala short hand for a named column.
 
 Column objects can be composed to form complex expressions:
 
   $"a" + 1
   $"a" === $"b"
 - Since:
- 1.3.0
- 
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging:
- org.apache.spark.internal.Logging.LogStringContext
- org.apache.spark.internal.Logging.SparkShellLoggingFilter
- 
Constructor Summary
Constructors:
- Column(org.apache.spark.sql.internal.ColumnNode node)
- Column(String name)
- 
Method Summary
- alias(String alias): Gives the column an alias.
- and(other): Boolean AND.
- apply(extraction): Extracts a value or values from a complex type.
- as(String alias): Gives the column an alias.
- as(String[] aliases): Assigns the given aliases to the results of a table generating function.
- as(String alias, Metadata metadata): Gives the column an alias with metadata.
- as(Encoder<U> evidence$1): Provides a type hint about the expected return value of this column (returns TypedColumn<Object,U>).
- as(scala.collection.immutable.Seq<String> aliases): (Scala-specific) Assigns the given aliases to the results of a table generating function.
- as(scala.Symbol alias): Gives the column an alias.
- asc(): Returns a sort expression based on ascending order of the column.
- asc_nulls_first(): Returns a sort expression based on ascending order of the column, with null values appearing before non-null values.
- asc_nulls_last(): Returns a sort expression based on ascending order of the column, with null values appearing after non-null values.
- between(lowerBound, upperBound): True if the current column is between the lower bound and upper bound, inclusive.
- bitwiseAND(Object other): Compute bitwise AND of this expression with another expression.
- bitwiseOR(other): Compute bitwise OR of this expression with another expression.
- bitwiseXOR(Object other): Compute bitwise XOR of this expression with another expression.
- cast(String to): Casts the column to a different data type, using the canonical string representation of the type.
- cast(DataType to): Casts the column to a different data type.
- contains(other): Contains the other element.
- desc(): Returns a sort expression based on the descending order of the column.
- desc_nulls_first(): Returns a sort expression based on the descending order of the column, with null values appearing before non-null values.
- desc_nulls_last(): Returns a sort expression based on the descending order of the column, with null values appearing after non-null values.
- divide(other): Division of this expression by another expression.
- dropFields(scala.collection.immutable.Seq<String> fieldNames): An expression that drops fields in a StructType by name.
- endsWith(String literal): String ends with another string literal.
- endsWith(Column other): String ends with.
- eqNullSafe(Object other): Equality test that is safe for null values.
- equals(Object that): Equality test.
- explain(boolean extended): Prints the expression to the console for debugging purposes.
- geq(other): Greater than or equal to an expression.
- getField(fieldName): An expression that gets a field by name in a StructType.
- getItem(key): An expression that gets an item at position `ordinal` out of an array, or gets a value by key `key` in a MapType.
- gt(other): Greater than.
- hashCode(): Returns a hash code for this column.
- ilike(literal): SQL ILIKE expression (case-insensitive LIKE).
- isin(list...): A boolean expression that is evaluated to true if the value of this expression is contained by the evaluated values of the arguments.
- isin(Dataset ds): A boolean expression that is evaluated to true if the value of this expression is contained by the provided Dataset/DataFrame.
- isin(scala.collection.immutable.Seq<Object> list): (Scala-specific) A boolean expression that is evaluated to true if the value of this expression is contained by the evaluated values of the arguments.
- isInCollection(Iterable<?> values): A boolean expression that is evaluated to true if the value of this expression is contained by the provided collection.
- isInCollection(scala.collection.Iterable<?> values): A boolean expression that is evaluated to true if the value of this expression is contained by the provided collection.
- isNaN(): True if the current expression is NaN.
- isNotNull(): True if the current expression is NOT null.
- isNull(): True if the current expression is null.
- leq(other): Less than or equal to.
- like(literal): SQL like expression.
- lt(other): Less than.
- minus(other): Subtraction.
- mod(other): Modulo (a.k.a. remainder) expression.
- multiply(other): Multiplication of this expression and another expression.
- name(alias): Gives the column a name (alias).
- node(): Returns the underlying org.apache.spark.sql.internal.ColumnNode.
- notEqual(other): Inequality test.
- or(other): Boolean OR.
- otherwise(value): Evaluates a list of conditions and returns one of multiple possible result expressions.
- outer(): Marks this column as an outer column if its expression refers to columns from an outer query.
- over(): Defines an empty analytic clause.
- over(WindowSpec window): Defines a windowing column.
- plus(other): Sum of this expression and another expression.
- rlike(literal): SQL RLIKE expression (LIKE with Regex).
- startsWith(String literal): String starts with another string literal.
- startsWith(Column other): String starts with.
- substr(int startPos, int len): An expression that returns a substring.
- substr(Column startPos, Column len): An expression that returns a substring.
- toString(): Returns a string representation of the column.
- transform(f): Concise syntax for chaining custom transformations.
- try_cast(String to): Casts the column to a different data type; the result is null on failure.
- try_cast(DataType to): Casts the column to a different data type; the result is null on failure.
- when(condition, value): Evaluates a list of conditions and returns one of multiple possible result expressions.
- withField(fieldName, col): An expression that adds/replaces a field in a StructType by name.

Methods inherited from interface org.apache.spark.internal.Logging:
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logBasedOnLevel, logDebug, logError, logInfo, logName, LogStringContext, logTrace, logWarning, MDC, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
- 
Constructor Details
- 
Column
public Column(org.apache.spark.sql.internal.ColumnNode node)
- 
Column
public Column(String name)
- 
Method Details
- 
isin
A boolean expression that is evaluated to true if the value of this expression is contained by the evaluated values of the arguments. See the example after the parameter list.

Note: Since the types of the elements in the list are inferred only at run time, the elements will be "up-casted" to the most common type for comparison. For example: 1) for "Int vs String", the "Int" will be up-casted to "String" and the comparison will be "String vs String"; 2) for "Float vs Double", the "Float" will be up-casted to "Double" and the comparison will be "Double vs Double".
- Parameters:
- list- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
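
A minimal sketch (assuming a DataFrame `df` with an integer column `age`):

   // Keep only rows whose age is one of the listed values.
   df.filter(df("age").isin(18, 21, 25))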
 
- 
node
public org.apache.spark.sql.internal.ColumnNode node()
- 
toString
public String toString()
- 
equals
public boolean equals(Object that)
- 
hashCode
public int hashCode()
- 
as
Provides a type hint about the expected return value of this column. This information can be used by operations such as select on a Dataset to automatically convert the results into the correct JVM types. See the example after the parameter list.
- Parameters:
- evidence$1- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
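
A minimal sketch (assuming a DataFrame `df` with an integer column `age`, and `spark.implicits._` in scope to supply the Encoder):

   // select on a Dataset returns Dataset[Int] instead of DataFrame.
   df.select(df("age").as[Int])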
 
- 
apply
Extracts a value or values from a complex type. The following types of extraction are supported (see the example after the parameter list):
- Given an Array, an integer ordinal can be used to retrieve a single value.
- Given a Map, a key of the correct type can be used to retrieve an individual value.
- Given a Struct, a string fieldName can be used to extract that field.
- Given an Array of Structs, a string fieldName can be used to extract the field of every struct in that array, and return an Array of fields.
 - Parameters:
- extraction- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
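
A minimal sketch (assuming a DataFrame `df` with an array column `tags` and a struct column `address`):

   df.select(df("tags")(0))          // First element of the array.
   df.select(df("address")("city"))  // Field `city` of the struct.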
 
- 
equalTo
Equality test.

   // Scala:
   df.filter( df("colA") === df("colB") )

   // Java:
   import static org.apache.spark.sql.functions.*;
   df.filter( col("colA").equalTo(col("colB")) );
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
notEqual
Inequality test.

   // Scala:
   df.select( df("colA") !== df("colB") )
   df.select( !(df("colA") === df("colB")) )

   // Java:
   import static org.apache.spark.sql.functions.*;
   df.filter( col("colA").notEqual(col("colB")) );
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
gt
Greater than.

   // Scala: The following selects people older than 21.
   people.select( people("age") > lit(21) )

   // Java:
   import static org.apache.spark.sql.functions.*;
   people.select( people.col("age").gt(21) );
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
lt
Less than.

   // Scala: The following selects people younger than 21.
   people.select( people("age") < 21 )

   // Java:
   people.select( people.col("age").lt(21) );
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
leq
Less than or equal to.

   // Scala: The following selects people aged 21 or younger.
   people.select( people("age") <= 21 )

   // Java:
   people.select( people.col("age").leq(21) );
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
geq
Greater than or equal to an expression.

   // Scala: The following selects people aged 21 or older.
   people.select( people("age") >= 21 )

   // Java:
   people.select( people.col("age").geq(21) )
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
eqNullSafe
Equality test that is safe for null values. See the example after the parameter list.
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
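
A minimal sketch (assuming a DataFrame `df` with nullable columns `a` and `b`):

   // Unlike ===, the null-safe test returns true when both sides are null,
   // and false (rather than null) when exactly one side is null.
   df.where(df("a") <=> df("b"))          // Scala operator form.
   df.where(df("a").eqNullSafe(df("b")))  // Method form.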
 
- 
when
Evaluates a list of conditions and returns one of multiple possible result expressions. If otherwise is not defined at the end, null is returned for unmatched conditions.

   // Example: encoding gender string column into integer.

   // Scala:
   people.select(when(people("gender") === "male", 0)
     .when(people("gender") === "female", 1)
     .otherwise(2))

   // Java:
   people.select(when(col("gender").equalTo("male"), 0)
     .when(col("gender").equalTo("female"), 1)
     .otherwise(2))
- Parameters:
- condition- (undocumented)
- value- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
otherwise
Evaluates a list of conditions and returns one of multiple possible result expressions. If otherwise is not defined at the end, null is returned for unmatched conditions.

   // Example: encoding gender string column into integer.

   // Scala:
   people.select(when(people("gender") === "male", 0)
     .when(people("gender") === "female", 1)
     .otherwise(2))

   // Java:
   people.select(when(col("gender").equalTo("male"), 0)
     .when(col("gender").equalTo("female"), 1)
     .otherwise(2))
- Parameters:
- value- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
between
True if the current column is between the lower bound and upper bound, inclusive. See the example after the parameter list.
- Parameters:
- lowerBound- (undocumented)
- upperBound- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
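
A minimal sketch (assuming a DataFrame `df` with an integer column `age`):

   // Both bounds are inclusive: keeps ages 18 through 30.
   df.filter(df("age").between(18, 30))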
 
- 
isNaN
True if the current expression is NaN.
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
isNull
True if the current expression is null.
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
isNotNull
True if the current expression is NOT null. See the example after the parameter list.
- Returns:
- (undocumented)
- Since:
- 1.3.0
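
A minimal sketch of the null checks (assuming a DataFrame `df` with a nullable column `email`):

   df.filter(df("email").isNotNull)  // Keep rows with a non-null email.
   df.filter(df("email").isNull)     // Keep rows where email is null.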
 
- 
or
Boolean OR.

   // Scala: The following selects people that are in school or employed.
   people.filter( people("inSchool") || people("isEmployed") )

   // Java:
   people.filter( people.col("inSchool").or(people.col("isEmployed")) );
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
and
Boolean AND.

   // Scala: The following selects people that are in school and employed at the same time.
   people.select( people("inSchool") && people("isEmployed") )

   // Java:
   people.select( people.col("inSchool").and(people.col("isEmployed")) );
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
plus
Sum of this expression and another expression.

   // Scala: The following selects the sum of a person's height and weight.
   people.select( people("height") + people("weight") )

   // Java:
   people.select( people.col("height").plus(people.col("weight")) );
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
minus
Subtraction. Subtract the other expression from this expression.

   // Scala: The following selects the difference between people's height and their weight.
   people.select( people("height") - people("weight") )

   // Java:
   people.select( people.col("height").minus(people.col("weight")) );
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
multiply
Multiplication of this expression and another expression.

   // Scala: The following multiplies a person's height by their weight.
   people.select( people("height") * people("weight") )

   // Java:
   people.select( people.col("height").multiply(people.col("weight")) );
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
divide
Division of this expression by another expression.

   // Scala: The following divides a person's height by their weight.
   people.select( people("height") / people("weight") )

   // Java:
   people.select( people.col("height").divide(people.col("weight")) );
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
mod
Modulo (a.k.a. remainder) expression. See the example after the parameter list.
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
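
A minimal sketch (assuming a DataFrame `df` with an integer column `id`):

   // Keep rows with even ids.
   df.filter(df("id") % 2 === 0)          // Scala operator form.
   df.filter(df("id").mod(2).equalTo(0))  // Method form.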
 
- 
isin
A boolean expression that is evaluated to true if the value of this expression is contained by the evaluated values of the arguments.

Note: Since the types of the elements in the list are inferred only at run time, the elements will be "up-casted" to the most common type for comparison. For example: 1) for "Int vs String", the "Int" will be up-casted to "String" and the comparison will be "String vs String"; 2) for "Float vs Double", the "Float" will be up-casted to "Double" and the comparison will be "Double vs Double".
- Parameters:
- list- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
isInCollection
A boolean expression that is evaluated to true if the value of this expression is contained by the provided collection.

Note: Since the types of the elements in the collection are inferred only at run time, the elements will be "up-casted" to the most common type for comparison. For example: 1) for "Int vs String", the "Int" will be up-casted to "String" and the comparison will be "String vs String"; 2) for "Float vs Double", the "Float" will be up-casted to "Double" and the comparison will be "Double vs Double".
- Parameters:
- values- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
isInCollection
A boolean expression that is evaluated to true if the value of this expression is contained by the provided collection. See the example after the parameter list.

Note: Since the types of the elements in the collection are inferred only at run time, the elements will be "up-casted" to the most common type for comparison. For example: 1) for "Int vs String", the "Int" will be up-casted to "String" and the comparison will be "String vs String"; 2) for "Float vs Double", the "Float" will be up-casted to "Double" and the comparison will be "Double vs Double".
- Parameters:
- values- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
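
A minimal sketch (assuming a DataFrame `df` with a string column `country`):

   // Keep rows whose country code is in the given collection.
   val allowed = Seq("DE", "FR", "IT")
   df.filter(df("country").isInCollection(allowed))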
 
- 
isin
A boolean expression that is evaluated to true if the value of this expression is contained by the provided Dataset/DataFrame. See the example after the parameter list.
- Parameters:
- ds- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
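
A minimal sketch (assuming DataFrames `orders` and `activeUserIds`, where `activeUserIds` holds a single column of ids; the membership test is evaluated against the contents of the provided Dataset):

   orders.filter(orders("userId").isin(activeUserIds))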
 
- 
like
SQL like expression. Returns a boolean column based on a SQL LIKE match. See the example after the parameter list.
- Parameters:
- literal- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
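
A minimal sketch (assuming a DataFrame `df` with a string column `name`):

   // SQL LIKE: `%` matches any sequence of characters, `_` a single character.
   df.filter(df("name").like("Al%"))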
 
- 
rlike
SQL RLIKE expression (LIKE with Regex). Returns a boolean column based on a regex match. See the example after the parameter list.
- Parameters:
- literal- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
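
A minimal sketch (assuming a DataFrame `df` with a string column `name`):

   // Regex match: names starting with "Al".
   df.filter(df("name").rlike("^Al"))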
 
- 
ilike
SQL ILIKE expression (case insensitive LIKE). See the example after the parameter list.
- Parameters:
- literal- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.3.0
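
A minimal sketch (assuming a DataFrame `df` with a string column `name`):

   // Case-insensitive LIKE: matches "al...", "AL...", "Al...", etc.
   df.filter(df("name").ilike("al%"))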
 
- 
getItem
An expression that gets an item at position `ordinal` out of an array, or gets a value by key `key` in a MapType. See the example after the parameter list.
- Parameters:
- key- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
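
A minimal sketch (assuming a DataFrame `df` with an array column `tags` and a map column `attrs`):

   df.select(df("tags").getItem(0))         // First array element.
   df.select(df("attrs").getItem("color"))  // Value for key "color".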
 
- 
withField
An expression that adds/replaces a field in a StructType by name.

   val df = sql("SELECT named_struct('a', 1, 'b', 2) struct_col")
   df.select($"struct_col".withField("c", lit(3)))
   // result: {"a":1,"b":2,"c":3}

   val df = sql("SELECT named_struct('a', 1, 'b', 2) struct_col")
   df.select($"struct_col".withField("b", lit(3)))
   // result: {"a":1,"b":3}

   val df = sql("SELECT CAST(NULL AS struct<a:int,b:int>) struct_col")
   df.select($"struct_col".withField("c", lit(3)))
   // result: null of type struct<a:int,b:int,c:int>

   val df = sql("SELECT named_struct('a', 1, 'b', 2, 'b', 3) struct_col")
   df.select($"struct_col".withField("b", lit(100)))
   // result: {"a":1,"b":100,"b":100}

   val df = sql("SELECT named_struct('a', named_struct('a', 1, 'b', 2)) struct_col")
   df.select($"struct_col".withField("a.c", lit(3)))
   // result: {"a":{"a":1,"b":2,"c":3}}

   val df = sql("SELECT named_struct('a', named_struct('b', 1), 'a', named_struct('c', 2)) struct_col")
   df.select($"struct_col".withField("a.c", lit(3)))
   // result: org.apache.spark.sql.AnalysisException: Ambiguous reference to fields

This method supports adding/replacing nested fields directly, e.g.

   val df = sql("SELECT named_struct('a', named_struct('a', 1, 'b', 2)) struct_col")
   df.select($"struct_col".withField("a.c", lit(3)).withField("a.d", lit(4)))
   // result: {"a":{"a":1,"b":2,"c":3,"d":4}}

However, if you are going to add/replace multiple nested fields, it is more optimal to extract out the nested struct before adding/replacing multiple fields, e.g.

   val df = sql("SELECT named_struct('a', named_struct('a', 1, 'b', 2)) struct_col")
   df.select($"struct_col".withField("a", $"struct_col.a".withField("c", lit(3)).withField("d", lit(4))))
   // result: {"a":{"a":1,"b":2,"c":3,"d":4}}
- Parameters:
- fieldName- (undocumented)
- col- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.1.0
 
- 
dropFields
An expression that drops fields in a StructType by name. This is a no-op if the schema doesn't contain the field name(s).

   val df = sql("SELECT named_struct('a', 1, 'b', 2) struct_col")
   df.select($"struct_col".dropFields("b"))
   // result: {"a":1}

   val df = sql("SELECT named_struct('a', 1, 'b', 2) struct_col")
   df.select($"struct_col".dropFields("c"))
   // result: {"a":1,"b":2}

   val df = sql("SELECT named_struct('a', 1, 'b', 2, 'c', 3) struct_col")
   df.select($"struct_col".dropFields("b", "c"))
   // result: {"a":1}

   val df = sql("SELECT named_struct('a', 1, 'b', 2) struct_col")
   df.select($"struct_col".dropFields("a", "b"))
   // result: org.apache.spark.sql.AnalysisException: [DATATYPE_MISMATCH.CANNOT_DROP_ALL_FIELDS]
   // Cannot resolve "update_fields(struct_col, dropfield(), dropfield())" due to data type
   // mismatch: Cannot drop all fields in struct.;

   val df = sql("SELECT CAST(NULL AS struct<a:int,b:int>) struct_col")
   df.select($"struct_col".dropFields("b"))
   // result: null of type struct<a:int>

   val df = sql("SELECT named_struct('a', 1, 'b', 2, 'b', 3) struct_col")
   df.select($"struct_col".dropFields("b"))
   // result: {"a":1}

   val df = sql("SELECT named_struct('a', named_struct('a', 1, 'b', 2)) struct_col")
   df.select($"struct_col".dropFields("a.b"))
   // result: {"a":{"a":1}}

   val df = sql("SELECT named_struct('a', named_struct('b', 1), 'a', named_struct('c', 2)) struct_col")
   df.select($"struct_col".dropFields("a.c"))
   // result: org.apache.spark.sql.AnalysisException: Ambiguous reference to fields

This method supports dropping multiple nested fields directly, e.g.

   val df = sql("SELECT named_struct('a', named_struct('a', 1, 'b', 2)) struct_col")
   df.select($"struct_col".dropFields("a.b", "a.c"))
   // result: {"a":{"a":1}}

However, if you are going to drop multiple nested fields, it is more optimal to extract out the nested struct before dropping multiple fields from it, e.g.

   val df = sql("SELECT named_struct('a', named_struct('a', 1, 'b', 2)) struct_col")
   df.select($"struct_col".withField("a", $"struct_col.a".dropFields("b", "c")))
   // result: {"a":{"a":1}}
- Parameters:
- fieldNames- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.1.0
 
- 
getField
An expression that gets a field by name in a StructType. See the example after the parameter list.
- Parameters:
- fieldName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
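
A minimal sketch (assuming a DataFrame `df` with a struct column `address` that has a `city` field):

   df.select(df("address").getField("city"))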
 
- 
substr
An expression that returns a substring. See the example after the parameter list.
- Parameters:
- startPos- expression for the starting position.
- len- expression for the length of the substring.
- Returns:
- (undocumented)
- Since:
- 1.3.0
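
A minimal sketch covering both overloads (assuming a DataFrame `df` with a string column `name`; positions are 1-based, and `lit` comes from org.apache.spark.sql.functions):

   df.select(df("name").substr(lit(1), lit(3)))  // Column-based start/length.
   df.select(df("name").substr(1, 3))            // Literal start/length.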
 
- 
substr
An expression that returns a substring.
- Parameters:
- startPos- starting position.
- len- length of the substring.
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
contains
Contains the other element. Returns a boolean column based on a string match. See the example after the parameter list.
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
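
A minimal sketch (assuming a DataFrame `df` with a string column `name`):

   // Keep rows whose name contains the substring "ann".
   df.filter(df("name").contains("ann"))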
 
- 
startsWith
String starts with. Returns a boolean column based on a string match.
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
startsWith
String starts with another string literal. Returns a boolean column based on a string match. See the example after the parameter list.
- Parameters:
- literal- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
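
A minimal sketch covering both overloads (assuming a DataFrame `df` with string columns `name` and `prefix`):

   df.filter(df("name").startsWith("Al"))          // Literal prefix.
   df.filter(df("name").startsWith(df("prefix")))  // Column-valued prefix.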
 
- 
endsWith
String ends with. Returns a boolean column based on a string match.
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
endsWith
String ends with another string literal. Returns a boolean column based on a string match. See the example after the parameter list.
- Parameters:
- literal- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
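
A minimal sketch (assuming a DataFrame `df` with a string column `email`):

   df.filter(df("email").endsWith("@example.com"))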
 
- 
alias
Gives the column an alias. Same as as.

   // Renames colA to colB in select output.
   df.select($"colA".alias("colB"))
- Parameters:
- alias- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
as
Gives the column an alias.

   // Renames colA to colB in select output.
   df.select($"colA".as("colB"))

If the current column has metadata associated with it, this metadata will be propagated to the new column. If this is not desired, use the API as(alias: String, metadata: Metadata) with explicit metadata.
- Parameters:
- alias- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
as
(Scala-specific) Assigns the given aliases to the results of a table generating function.

   // Names the two columns produced by explode `key` and `value`.
   df.select(explode($"myMap").as("key" :: "value" :: Nil))
- Parameters:
- aliases- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
as
Assigns the given aliases to the results of a table generating function.

   // Names the two columns produced by explode `key` and `value`.
   df.select(explode($"myMap").as("key" :: "value" :: Nil))
- Parameters:
- aliases- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
as
Gives the column an alias.

   // Renames colA to colB in select output.
   df.select($"colA".as("colB"))

If the current column has metadata associated with it, this metadata will be propagated to the new column. If this is not desired, use the API as(alias: String, metadata: Metadata) with explicit metadata.
- Parameters:
- alias- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
as
Gives the column an alias with metadata.

   val metadata: Metadata = ...
   df.select($"colA".as("colB", metadata))
- Parameters:
- alias- (undocumented)
- metadata- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
name
Gives the column a name (alias).

   // Renames colA to colB in select output.
   df.select($"colA".name("colB"))

If the current column has metadata associated with it, this metadata will be propagated to the new column. If this is not desired, use the API as(alias: String, metadata: Metadata) with explicit metadata.
- Parameters:
- alias- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.0.0
 
- 
cast
Casts the column to a different data type.

   // Casts colA to IntegerType.
   import org.apache.spark.sql.types.IntegerType
   df.select(df("colA").cast(IntegerType))

   // equivalent to
   df.select(df("colA").cast("int"))
- Parameters:
- to- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
cast
Casts the column to a different data type, using the canonical string representation of the type. The supported types are: string, boolean, byte, short, int, long, float, double, decimal, date, timestamp.

   // Casts colA to integer.
   df.select(df("colA").cast("int"))
- Parameters:
- to- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
try_cast
Casts the column to a different data type and the result is null on failure.

   // Casts colA to IntegerType.
   import org.apache.spark.sql.types.IntegerType
   df.select(df("colA").try_cast(IntegerType))

   // equivalent to
   df.select(df("colA").try_cast("int"))
- Parameters:
- to- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
try_cast
Casts the column to a different data type and the result is null on failure.

   // Casts colA to integer.
   df.select(df("colA").try_cast("int"))
- Parameters:
- to- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
desc
Returns a sort expression based on the descending order of the column.

   // Scala
   df.sort(df("age").desc)

   // Java
   df.sort(df.col("age").desc());
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
desc_nulls_first
Returns a sort expression based on the descending order of the column, and null values appear before non-null values.

   // Scala: sort a DataFrame by age column in descending order and null values appearing first.
   df.sort(df("age").desc_nulls_first)

   // Java
   df.sort(df.col("age").desc_nulls_first());
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
desc_nulls_last
Returns a sort expression based on the descending order of the column, and null values appear after non-null values.

   // Scala: sort a DataFrame by age column in descending order and null values appearing last.
   df.sort(df("age").desc_nulls_last)

   // Java
   df.sort(df.col("age").desc_nulls_last());
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
asc
Returns a sort expression based on ascending order of the column.

   // Scala: sort a DataFrame by age column in ascending order.
   df.sort(df("age").asc)

   // Java
   df.sort(df.col("age").asc());
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
asc_nulls_first
Returns a sort expression based on ascending order of the column, and null values appear before non-null values.

   // Scala: sort a DataFrame by age column in ascending order and null values appearing first.
   df.sort(df("age").asc_nulls_first)

   // Java
   df.sort(df.col("age").asc_nulls_first());
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
asc_nulls_last
Returns a sort expression based on ascending order of the column, and null values appear after non-null values.

   // Scala: sort a DataFrame by age column in ascending order and null values appearing last.
   df.sort(df("age").asc_nulls_last)

   // Java
   df.sort(df.col("age").asc_nulls_last());
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
explain
public void explain(boolean extended)
Prints the expression to the console for debugging purposes. See the example after the parameter list.
- Parameters:
- extended- (undocumented)
- Since:
- 1.3.0
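
A minimal sketch (assuming a DataFrame `df`):

   // Print the expression tree for a column; pass true for extended output.
   (df("a") + 1).explain(true)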
 
- 
bitwiseOR
Compute bitwise OR of this expression with another expression.

   df.select($"colA".bitwiseOR($"colB"))
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
bitwiseAND
Compute bitwise AND of this expression with another expression.

   df.select($"colA".bitwiseAND($"colB"))
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
bitwiseXOR
Compute bitwise XOR of this expression with another expression.

   df.select($"colA".bitwiseXOR($"colB"))
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
over
Defines a windowing column.

   val w = Window.partitionBy("name").orderBy("id")
   df.select(
     sum("price").over(w.rangeBetween(Window.unboundedPreceding, 2)),
     avg("price").over(w.rowsBetween(Window.currentRow, 4))
   )
- Parameters:
- window- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
over
Defines an empty analytic clause. In this case the analytic function is applied and presented for all rows in the result set.

   df.select(
     sum("price").over(),
     avg("price").over()
   )
- Returns:
- (undocumented)
- Since:
- 2.0.0
 
- 
outer
Mark this column as an outer column if its expression refers to columns from an outer query. This is used to trigger lazy analysis of a Spark Classic DataFrame, so that it can be used to build subquery expressions. Spark Connect DataFrames are always lazily analyzed and do not need this function.

   // Spark can't analyze this `df` now, as it doesn't know how to resolve `t1.col`.
   val df = spark.table("t2").where($"t2.col" === $"t1.col".outer())
   // Since this `df` is lazily analyzed, you won't see any error until you try to execute it.
   df.collect() // Fails with UNRESOLVED_COLUMN error.
   // Now Spark can resolve `t1.col` with the outer plan `spark.table("t1")`.
   spark.table("t1").where(df.exists())
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
transform
Concise syntax for chaining custom transformations.

   def addPrefix(c: Column): Column = concat(lit("prefix_"), c)
   df.select($"name".transform(addPrefix))

   // Chaining multiple transformations
   df.select($"name".transform(addPrefix).transform(upper))
- Parameters:
- f- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
 