Class Column
- All Implemented Interfaces:
- org.apache.spark.internal.Logging
- org.apache.spark.sql.internal.TableValuedFunctionArgument
- Direct Known Subclasses:
- ColumnName
- TypedColumn
A column that will be computed based on the data in a DataFrame.
 A new column can be constructed based on the input columns present in a DataFrame:
   df("columnName")            // On a specific `df` DataFrame.
   col("columnName")           // A generic column not yet associated with a DataFrame.
   col("columnName.field")     // Extracting a struct field
   col("`a.column.with.dots`") // Escape `.` in column names.
   $"columnName"               // Scala short hand for a named column.
 
 Column objects can be composed to form complex expressions:
 
   $"a" + 1
   $"a" === $"b"
 - Since:
- 1.3.0
- 
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging:
- org.apache.spark.internal.Logging.LogStringContext
- org.apache.spark.internal.Logging.SparkShellLoggingFilter
- 
Constructor Summary
Constructors:
- Column(org.apache.spark.sql.internal.ColumnNode node)
- Column(String name)
- 
Method Summary
- alias(String alias): Gives the column an alias.
- and(other): Boolean AND.
- apply(extraction): Extracts a value or values from a complex type.
- as(String alias): Gives the column an alias.
- as(String[] aliases): Assigns the given aliases to the results of a table generating function.
- as(String alias, Metadata metadata): Gives the column an alias with metadata.
- as(Encoder<U> evidence$1): Provides a type hint about the expected return value of this column (returns TypedColumn<Object,U>).
- as(scala.collection.immutable.Seq<String> aliases): (Scala-specific) Assigns the given aliases to the results of a table generating function.
- as(scala.Symbol alias): Gives the column an alias.
- asc(): Returns a sort expression based on ascending order of the column.
- asc_nulls_first(): Returns a sort expression based on ascending order of the column, with null values appearing before non-null values.
- asc_nulls_last(): Returns a sort expression based on ascending order of the column, with null values appearing after non-null values.
- between(lowerBound, upperBound): True if the current column is between the lower bound and upper bound, inclusive.
- bitwiseAND(Object other): Compute bitwise AND of this expression with another expression.
- bitwiseOR(other): Compute bitwise OR of this expression with another expression.
- bitwiseXOR(Object other): Compute bitwise XOR of this expression with another expression.
- cast(String to): Casts the column to a different data type, using the canonical string representation of the type.
- cast(DataType to): Casts the column to a different data type.
- contains(other): Contains the other element.
- desc(): Returns a sort expression based on the descending order of the column.
- desc_nulls_first(): Returns a sort expression based on the descending order of the column, with null values appearing before non-null values.
- desc_nulls_last(): Returns a sort expression based on the descending order of the column, with null values appearing after non-null values.
- divide(other): Division of this expression by another expression.
- dropFields(scala.collection.immutable.Seq<String> fieldNames): An expression that drops fields in a StructType by name.
- endsWith(String literal): String ends with another string literal.
- endsWith(Column other): String ends with.
- eqNullSafe(Object other): Equality test that is safe for null values.
- equals(Object that): Equality test.
- explain(boolean extended): Prints the expression to the console for debugging purposes.
- geq(other): Greater than or equal to an expression.
- getField(fieldName): An expression that gets a field by name in a StructType.
- getItem(key): An expression that gets an item at position `ordinal` out of an array, or gets a value by key `key` in a MapType.
- gt(other): Greater than.
- hashCode(): Returns a hash code for this column.
- ilike(literal): SQL ILIKE expression (case-insensitive LIKE).
- isin(list...): A boolean expression that is evaluated to true if the value of this expression is contained by the evaluated values of the arguments.
- isin(Dataset ds): A boolean expression that is evaluated to true if the value of this expression is contained by the provided Dataset/DataFrame.
- isin(scala.collection.immutable.Seq<Object> list): (Scala-specific) A boolean expression that is evaluated to true if the value of this expression is contained by the evaluated values of the arguments.
- isInCollection(Iterable<?> values): A boolean expression that is evaluated to true if the value of this expression is contained by the provided collection.
- isInCollection(scala.collection.Iterable<?> values): A boolean expression that is evaluated to true if the value of this expression is contained by the provided collection.
- isNaN(): True if the current expression is NaN.
- isNotNull(): True if the current expression is NOT null.
- isNull(): True if the current expression is null.
- leq(other): Less than or equal to.
- like(literal): SQL like expression.
- lt(other): Less than.
- minus(other): Subtraction.
- mod(other): Modulo (a.k.a. remainder) expression.
- multiply(other): Multiplication of this expression and another expression.
- name(alias): Gives the column a name (alias).
- node(): Returns the underlying org.apache.spark.sql.internal.ColumnNode.
- notEqual(other): Inequality test.
- or(other): Boolean OR.
- otherwise(value): Evaluates a list of conditions and returns one of multiple possible result expressions.
- outer(): Marks this column as an outer column if its expression refers to columns from an outer query.
- over(): Defines an empty analytic clause.
- over(WindowSpec window): Defines a windowing column.
- plus(other): Sum of this expression and another expression.
- rlike(literal): SQL RLIKE expression (LIKE with Regex).
- startsWith(String literal): String starts with another string literal.
- startsWith(Column other): String starts with.
- substr(int startPos, int len): An expression that returns a substring.
- substr(Column startPos, Column len): An expression that returns a substring.
- toString(): Returns a string representation of the column.
- transform(f): Concise syntax for chaining custom transformations.
- try_cast(String to): Casts the column to a different data type; the result is null on failure.
- try_cast(DataType to): Casts the column to a different data type; the result is null on failure.
- when(condition, value): Evaluates a list of conditions and returns one of multiple possible result expressions.
- withField(fieldName, col): An expression that adds/replaces a field in a StructType by name.

Methods inherited from interface org.apache.spark.internal.Logging:
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logBasedOnLevel, logDebug, logError, logInfo, logName, LogStringContext, logTrace, logWarning, MDC, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
- 
Constructor Details
- 
Column
public Column(org.apache.spark.sql.internal.ColumnNode node)
- 
Column
public Column(String name)
- 
Method Details
- 
isin
A boolean expression that is evaluated to true if the value of this expression is contained by the evaluated values of the arguments. See the example after the parameter list.

Note: Since the types of the elements in the list are inferred only at run time, the elements will be "up-casted" to the most common type for comparison. For example: 1) for "Int vs String", the "Int" will be up-casted to "String" and the comparison will be "String vs String"; 2) for "Float vs Double", the "Float" will be up-casted to "Double" and the comparison will be "Double vs Double".
- Parameters:
- list- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
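
A minimal sketch (assuming a DataFrame `df` with an integer column `age`):

   // Keep only rows whose age is one of the listed values.
   df.filter(df("age").isin(18, 21, 25))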
 
- 
node
public org.apache.spark.sql.internal.ColumnNode node()
- 
toString
public String toString()
- 
equals
public boolean equals(Object that)
- 
hashCode
public int hashCode()
- 
as
Provides a type hint about the expected return value of this column. This information can be used by operations such as select on a Dataset to automatically convert the results into the correct JVM types. See the example after the parameter list.
- Parameters:
- evidence$1- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
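
A minimal sketch (assuming a DataFrame `df` with an integer column `age`, and `spark.implicits._` in scope to supply the Encoder):

   // select on a Dataset returns Dataset[Int] instead of DataFrame.
   df.select(df("age").as[Int])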
 
- 
apply
Extracts a value or values from a complex type. The following types of extraction are supported (see the example after the parameter list):
- Given an Array, an integer ordinal can be used to retrieve a single value.
- Given a Map, a key of the correct type can be used to retrieve an individual value.
- Given a Struct, a string fieldName can be used to extract that field.
- Given an Array of Structs, a string fieldName can be used to extract the field of every struct in that array, and return an Array of fields.
 - Parameters:
- extraction- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
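
A minimal sketch (assuming a DataFrame `df` with an array column `tags` and a struct column `address`):

   df.select(df("tags")(0))          // First element of the array.
   df.select(df("address")("city"))  // Field `city` of the struct.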
 
- 
equalTo
Equality test.

   // Scala:
   df.filter( df("colA") === df("colB") )

   // Java:
   import static org.apache.spark.sql.functions.*;
   df.filter( col("colA").equalTo(col("colB")) );
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
notEqual
Inequality test.

   // Scala:
   df.select( df("colA") !== df("colB") )
   df.select( !(df("colA") === df("colB")) )

   // Java:
   import static org.apache.spark.sql.functions.*;
   df.filter( col("colA").notEqual(col("colB")) );
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
gt
Greater than.

   // Scala: The following selects people older than 21.
   people.select( people("age") > lit(21) )

   // Java:
   import static org.apache.spark.sql.functions.*;
   people.select( people.col("age").gt(21) );
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
lt
Less than.

   // Scala: The following selects people younger than 21.
   people.select( people("age") < 21 )

   // Java:
   people.select( people.col("age").lt(21) );
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
leq
Less than or equal to.

   // Scala: The following selects people aged 21 or younger.
   people.select( people("age") <= 21 )

   // Java:
   people.select( people.col("age").leq(21) );
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
geq
Greater than or equal to an expression.

   // Scala: The following selects people aged 21 or older.
   people.select( people("age") >= 21 )

   // Java:
   people.select( people.col("age").geq(21) )
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
eqNullSafe
Equality test that is safe for null values. See the example after the parameter list.
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
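
A minimal sketch (assuming a DataFrame `df` with nullable columns `a` and `b`):

   // Unlike ===, the null-safe test returns true when both sides are null,
   // and false (rather than null) when exactly one side is null.
   df.where(df("a") <=> df("b"))          // Scala operator form.
   df.where(df("a").eqNullSafe(df("b")))  // Method form.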
 
- 
when
Evaluates a list of conditions and returns one of multiple possible result expressions. If otherwise is not defined at the end, null is returned for unmatched conditions.

   // Example: encoding gender string column into integer.

   // Scala:
   people.select(when(people("gender") === "male", 0)
     .when(people("gender") === "female", 1)
     .otherwise(2))

   // Java:
   people.select(when(col("gender").equalTo("male"), 0)
     .when(col("gender").equalTo("female"), 1)
     .otherwise(2))
- Parameters:
- condition- (undocumented)
- value- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
otherwise
Evaluates a list of conditions and returns one of multiple possible result expressions. If otherwise is not defined at the end, null is returned for unmatched conditions.

   // Example: encoding gender string column into integer.

   // Scala:
   people.select(when(people("gender") === "male", 0)
     .when(people("gender") === "female", 1)
     .otherwise(2))

   // Java:
   people.select(when(col("gender").equalTo("male"), 0)
     .when(col("gender").equalTo("female"), 1)
     .otherwise(2))
- Parameters:
- value- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
between
True if the current column is between the lower bound and upper bound, inclusive. See the example after the parameter list.
- Parameters:
- lowerBound- (undocumented)
- upperBound- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
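
A minimal sketch (assuming a DataFrame `df` with an integer column `age`):

   // Both bounds are inclusive: keeps ages 18 through 30.
   df.filter(df("age").between(18, 30))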
 
- 
isNaN
True if the current expression is NaN.
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
isNull
True if the current expression is null.
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
isNotNull
True if the current expression is NOT null. See the example after the parameter list.
- Returns:
- (undocumented)
- Since:
- 1.3.0
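
A minimal sketch of the null checks (assuming a DataFrame `df` with a nullable column `email`):

   df.filter(df("email").isNotNull)  // Keep rows with a non-null email.
   df.filter(df("email").isNull)     // Keep rows where email is null.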
 
- 
or
Boolean OR.

   // Scala: The following selects people that are in school or employed.
   people.filter( people("inSchool") || people("isEmployed") )

   // Java:
   people.filter( people.col("inSchool").or(people.col("isEmployed")) );
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
and
Boolean AND.

   // Scala: The following selects people that are in school and employed at the same time.
   people.select( people("inSchool") && people("isEmployed") )

   // Java:
   people.select( people.col("inSchool").and(people.col("isEmployed")) );
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
plus
Sum of this expression and another expression.

   // Scala: The following selects the sum of a person's height and weight.
   people.select( people("height") + people("weight") )

   // Java:
   people.select( people.col("height").plus(people.col("weight")) );
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
minus
Subtraction. Subtract the other expression from this expression.

   // Scala: The following selects the difference between people's height and their weight.
   people.select( people("height") - people("weight") )

   // Java:
   people.select( people.col("height").minus(people.col("weight")) );
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
multiply
Multiplication of this expression and another expression.

   // Scala: The following multiplies a person's height by their weight.
   people.select( people("height") * people("weight") )

   // Java:
   people.select( people.col("height").multiply(people.col("weight")) );
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
divide
Division of this expression by another expression.

   // Scala: The following divides a person's height by their weight.
   people.select( people("height") / people("weight") )

   // Java:
   people.select( people.col("height").divide(people.col("weight")) );
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
mod
Modulo (a.k.a. remainder) expression. See the example after the parameter list.
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
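
A minimal sketch (assuming a DataFrame `df` with an integer column `id`):

   // Keep rows with even ids.
   df.filter(df("id") % 2 === 0)          // Scala operator form.
   df.filter(df("id").mod(2).equalTo(0))  // Method form.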
 
- 
isin
A boolean expression that is evaluated to true if the value of this expression is contained by the evaluated values of the arguments.

Note: Since the types of the elements in the list are inferred only at run time, the elements will be "up-casted" to the most common type for comparison. For example: 1) for "Int vs String", the "Int" will be up-casted to "String" and the comparison will be "String vs String"; 2) for "Float vs Double", the "Float" will be up-casted to "Double" and the comparison will be "Double vs Double".
- Parameters:
- list- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
isInCollection
A boolean expression that is evaluated to true if the value of this expression is contained by the provided collection.

Note: Since the types of the elements in the collection are inferred only at run time, the elements will be "up-casted" to the most common type for comparison. For example: 1) for "Int vs String", the "Int" will be up-casted to "String" and the comparison will be "String vs String"; 2) for "Float vs Double", the "Float" will be up-casted to "Double" and the comparison will be "Double vs Double".
- Parameters:
- values- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
isInCollection
A boolean expression that is evaluated to true if the value of this expression is contained by the provided collection. See the example after the parameter list.

Note: Since the types of the elements in the collection are inferred only at run time, the elements will be "up-casted" to the most common type for comparison. For example: 1) for "Int vs String", the "Int" will be up-casted to "String" and the comparison will be "String vs String"; 2) for "Float vs Double", the "Float" will be up-casted to "Double" and the comparison will be "Double vs Double".
- Parameters:
- values- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
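
A minimal sketch (assuming a DataFrame `df` with a string column `country`):

   // Keep rows whose country code is in the given collection.
   val allowed = Seq("DE", "FR", "IT")
   df.filter(df("country").isInCollection(allowed))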
 
- 
isin
A boolean expression that is evaluated to true if the value of this expression is contained by the provided Dataset/DataFrame. See the example after the parameter list.
- Parameters:
- ds- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
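
A minimal sketch (assuming DataFrames `orders` and `activeUserIds`, where `activeUserIds` holds a single column of ids; the membership test is evaluated against the contents of the provided Dataset):

   orders.filter(orders("userId").isin(activeUserIds))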
 
- 
like
SQL like expression. Returns a boolean column based on a SQL LIKE match. See the example after the parameter list.
- Parameters:
- literal- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
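
A minimal sketch (assuming a DataFrame `df` with a string column `name`):

   // SQL LIKE: `%` matches any sequence of characters, `_` a single character.
   df.filter(df("name").like("Al%"))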
 
- 
rlike
SQL RLIKE expression (LIKE with Regex). Returns a boolean column based on a regex match. See the example after the parameter list.
- Parameters:
- literal- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
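
A minimal sketch (assuming a DataFrame `df` with a string column `name`):

   // Regex match: names starting with "Al".
   df.filter(df("name").rlike("^Al"))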
 
- 
ilike
SQL ILIKE expression (case insensitive LIKE). See the example after the parameter list.
- Parameters:
- literal- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.3.0
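
A minimal sketch (assuming a DataFrame `df` with a string column `name`):

   // Case-insensitive LIKE: matches "al...", "AL...", "Al...", etc.
   df.filter(df("name").ilike("al%"))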
 
- 
getItem
An expression that gets an item at position `ordinal` out of an array, or gets a value by key `key` in a MapType. See the example after the parameter list.
- Parameters:
- key- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
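
A minimal sketch (assuming a DataFrame `df` with an array column `tags` and a map column `attrs`):

   df.select(df("tags").getItem(0))         // First array element.
   df.select(df("attrs").getItem("color"))  // Value for key "color".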
 
- 
withField
An expression that adds/replaces a field in a StructType by name.

   val df = sql("SELECT named_struct('a', 1, 'b', 2) struct_col")
   df.select($"struct_col".withField("c", lit(3)))
   // result: {"a":1,"b":2,"c":3}

   val df = sql("SELECT named_struct('a', 1, 'b', 2) struct_col")
   df.select($"struct_col".withField("b", lit(3)))
   // result: {"a":1,"b":3}

   val df = sql("SELECT CAST(NULL AS struct<a:int,b:int>) struct_col")
   df.select($"struct_col".withField("c", lit(3)))
   // result: null of type struct<a:int,b:int,c:int>

   val df = sql("SELECT named_struct('a', 1, 'b', 2, 'b', 3) struct_col")
   df.select($"struct_col".withField("b", lit(100)))
   // result: {"a":1,"b":100,"b":100}

   val df = sql("SELECT named_struct('a', named_struct('a', 1, 'b', 2)) struct_col")
   df.select($"struct_col".withField("a.c", lit(3)))
   // result: {"a":{"a":1,"b":2,"c":3}}

   val df = sql("SELECT named_struct('a', named_struct('b', 1), 'a', named_struct('c', 2)) struct_col")
   df.select($"struct_col".withField("a.c", lit(3)))
   // result: org.apache.spark.sql.AnalysisException: Ambiguous reference to fields

This method supports adding/replacing nested fields directly, e.g.

   val df = sql("SELECT named_struct('a', named_struct('a', 1, 'b', 2)) struct_col")
   df.select($"struct_col".withField("a.c", lit(3)).withField("a.d", lit(4)))
   // result: {"a":{"a":1,"b":2,"c":3,"d":4}}

However, if you are going to add/replace multiple nested fields, it is more optimal to extract out the nested struct before adding/replacing multiple fields, e.g.

   val df = sql("SELECT named_struct('a', named_struct('a', 1, 'b', 2)) struct_col")
   df.select($"struct_col".withField("a", $"struct_col.a".withField("c", lit(3)).withField("d", lit(4))))
   // result: {"a":{"a":1,"b":2,"c":3,"d":4}}
- Parameters:
- fieldName- (undocumented)
- col- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.1.0
 
- 
dropFields
An expression that drops fields in a StructType by name. This is a no-op if the schema doesn't contain the field name(s).

   val df = sql("SELECT named_struct('a', 1, 'b', 2) struct_col")
   df.select($"struct_col".dropFields("b"))
   // result: {"a":1}

   val df = sql("SELECT named_struct('a', 1, 'b', 2) struct_col")
   df.select($"struct_col".dropFields("c"))
   // result: {"a":1,"b":2}

   val df = sql("SELECT named_struct('a', 1, 'b', 2, 'c', 3) struct_col")
   df.select($"struct_col".dropFields("b", "c"))
   // result: {"a":1}

   val df = sql("SELECT named_struct('a', 1, 'b', 2) struct_col")
   df.select($"struct_col".dropFields("a", "b"))
   // result: org.apache.spark.sql.AnalysisException: [DATATYPE_MISMATCH.CANNOT_DROP_ALL_FIELDS]
   // Cannot resolve "update_fields(struct_col, dropfield(), dropfield())" due to data type
   // mismatch: Cannot drop all fields in struct.;

   val df = sql("SELECT CAST(NULL AS struct<a:int,b:int>) struct_col")
   df.select($"struct_col".dropFields("b"))
   // result: null of type struct<a:int>

   val df = sql("SELECT named_struct('a', 1, 'b', 2, 'b', 3) struct_col")
   df.select($"struct_col".dropFields("b"))
   // result: {"a":1}

   val df = sql("SELECT named_struct('a', named_struct('a', 1, 'b', 2)) struct_col")
   df.select($"struct_col".dropFields("a.b"))
   // result: {"a":{"a":1}}

   val df = sql("SELECT named_struct('a', named_struct('b', 1), 'a', named_struct('c', 2)) struct_col")
   df.select($"struct_col".dropFields("a.c"))
   // result: org.apache.spark.sql.AnalysisException: Ambiguous reference to fields

This method supports dropping multiple nested fields directly, e.g.

   val df = sql("SELECT named_struct('a', named_struct('a', 1, 'b', 2)) struct_col")
   df.select($"struct_col".dropFields("a.b", "a.c"))
   // result: {"a":{"a":1}}

However, if you are going to drop multiple nested fields, it is more optimal to extract out the nested struct before dropping multiple fields from it, e.g.

   val df = sql("SELECT named_struct('a', named_struct('a', 1, 'b', 2)) struct_col")
   df.select($"struct_col".withField("a", $"struct_col.a".dropFields("b", "c")))
   // result: {"a":{"a":1}}
- Parameters:
- fieldNames- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.1.0
 
- 
getField
An expression that gets a field by name in a StructType. See the example after the parameter list.
- Parameters:
- fieldName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
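
A minimal sketch (assuming a DataFrame `df` with a struct column `address` that has a `city` field):

   df.select(df("address").getField("city"))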
 
- 
substr
An expression that returns a substring. See the example after the parameter list.
- Parameters:
- startPos- expression for the starting position.
- len- expression for the length of the substring.
- Returns:
- (undocumented)
- Since:
- 1.3.0
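
A minimal sketch covering both overloads (assuming a DataFrame `df` with a string column `name`; positions are 1-based, and `lit` comes from org.apache.spark.sql.functions):

   df.select(df("name").substr(lit(1), lit(3)))  // Column-based start/length.
   df.select(df("name").substr(1, 3))            // Literal start/length.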
 
- 
substr
An expression that returns a substring.
- Parameters:
- startPos- starting position.
- len- length of the substring.
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
contains
Contains the other element. Returns a boolean column based on a string match. See the example after the parameter list.
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
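
A minimal sketch (assuming a DataFrame `df` with a string column `name`):

   // Keep rows whose name contains the substring "ann".
   df.filter(df("name").contains("ann"))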
 
- 
startsWith
String starts with. Returns a boolean column based on a string match.
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
startsWith
String starts with another string literal. Returns a boolean column based on a string match. See the example after the parameter list.
- Parameters:
- literal- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
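
A minimal sketch covering both overloads (assuming a DataFrame `df` with string columns `name` and `prefix`):

   df.filter(df("name").startsWith("Al"))          // Literal prefix.
   df.filter(df("name").startsWith(df("prefix")))  // Column-valued prefix.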
 
- 
endsWith
String ends with. Returns a boolean column based on a string match.
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
endsWith
String ends with another string literal. Returns a boolean column based on a string match. See the example after the parameter list.
- Parameters:
- literal- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
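
A minimal sketch (assuming a DataFrame `df` with a string column `email`):

   df.filter(df("email").endsWith("@example.com"))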
 
- 
alias
Gives the column an alias. Same as as.

   // Renames colA to colB in select output.
   df.select($"colA".alias("colB"))
- Parameters:
- alias- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
as
Gives the column an alias.

   // Renames colA to colB in select output.
   df.select($"colA".as("colB"))

If the current column has metadata associated with it, this metadata will be propagated to the new column. If this is not desired, use the API as(alias: String, metadata: Metadata) with explicit metadata.
- Parameters:
- alias- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
as
(Scala-specific) Assigns the given aliases to the results of a table generating function.

   // Names the two columns produced by explode `key` and `value`.
   df.select(explode($"myMap").as("key" :: "value" :: Nil))
- Parameters:
- aliases- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
as
Assigns the given aliases to the results of a table generating function.

   // Names the two columns produced by explode `key` and `value`.
   df.select(explode($"myMap").as("key" :: "value" :: Nil))
- Parameters:
- aliases- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
as
Gives the column an alias.

   // Renames colA to colB in select output.
   df.select($"colA".as("colB"))

If the current column has metadata associated with it, this metadata will be propagated to the new column. If this is not desired, use the API as(alias: String, metadata: Metadata) with explicit metadata.
- Parameters:
- alias- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
as
Gives the column an alias with metadata.

   val metadata: Metadata = ...
   df.select($"colA".as("colB", metadata))
- Parameters:
- alias- (undocumented)
- metadata- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
name
Gives the column a name (alias).

   // Renames colA to colB in select output.
   df.select($"colA".name("colB"))

If the current column has metadata associated with it, this metadata will be propagated to the new column. If this is not desired, use the API as(alias: String, metadata: Metadata) with explicit metadata.
- Parameters:
- alias- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.0.0
 
- 
cast
Casts the column to a different data type.

   // Casts colA to IntegerType.
   import org.apache.spark.sql.types.IntegerType
   df.select(df("colA").cast(IntegerType))

   // equivalent to
   df.select(df("colA").cast("int"))
- Parameters:
- to- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
cast
Casts the column to a different data type, using the canonical string representation of the type. The supported types are: string, boolean, byte, short, int, long, float, double, decimal, date, timestamp.

   // Casts colA to integer.
   df.select(df("colA").cast("int"))
- Parameters:
- to- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
try_cast
Casts the column to a different data type and the result is null on failure.

   // Casts colA to IntegerType.
   import org.apache.spark.sql.types.IntegerType
   df.select(df("colA").try_cast(IntegerType))

   // equivalent to
   df.select(df("colA").try_cast("int"))
- Parameters:
- to- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
try_cast
Casts the column to a different data type and the result is null on failure.

   // Casts colA to integer.
   df.select(df("colA").try_cast("int"))
- Parameters:
- to- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
desc
Returns a sort expression based on the descending order of the column.

   // Scala
   df.sort(df("age").desc)

   // Java
   df.sort(df.col("age").desc());
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
desc_nulls_first
Returns a sort expression based on the descending order of the column, and null values appear before non-null values.

   // Scala: sort a DataFrame by age column in descending order and null values appearing first.
   df.sort(df("age").desc_nulls_first)

   // Java
   df.sort(df.col("age").desc_nulls_first());
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
desc_nulls_last
Returns a sort expression based on the descending order of the column, and null values appear after non-null values.

   // Scala: sort a DataFrame by age column in descending order and null values appearing last.
   df.sort(df("age").desc_nulls_last)

   // Java
   df.sort(df.col("age").desc_nulls_last());
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
asc
Returns a sort expression based on ascending order of the column.

   // Scala: sort a DataFrame by age column in ascending order.
   df.sort(df("age").asc)

   // Java
   df.sort(df.col("age").asc());
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
asc_nulls_first
Returns a sort expression based on ascending order of the column, and null values appear before non-null values.

   // Scala: sort a DataFrame by age column in ascending order and null values appearing first.
   df.sort(df("age").asc_nulls_first)

   // Java
   df.sort(df.col("age").asc_nulls_first());
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
asc_nulls_last
Returns a sort expression based on ascending order of the column, and null values appear after non-null values.

   // Scala: sort a DataFrame by age column in ascending order and null values appearing last.
   df.sort(df("age").asc_nulls_last)

   // Java
   df.sort(df.col("age").asc_nulls_last());
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
explain
public void explain(boolean extended)
Prints the expression to the console for debugging purposes. See the example after the parameter list.
- Parameters:
- extended- (undocumented)
- Since:
- 1.3.0
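
A minimal sketch (assuming a DataFrame `df`):

   // Print the expression tree for a column; pass true for extended output.
   (df("a") + 1).explain(true)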
 
- 
bitwiseOR
Compute bitwise OR of this expression with another expression.

   df.select($"colA".bitwiseOR($"colB"))
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
bitwiseAND
Compute bitwise AND of this expression with another expression.

   df.select($"colA".bitwiseAND($"colB"))
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
bitwiseXOR
Compute bitwise XOR of this expression with another expression.

   df.select($"colA".bitwiseXOR($"colB"))
- Parameters:
- other- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
over
Defines a windowing column.

   val w = Window.partitionBy("name").orderBy("id")
   df.select(
     sum("price").over(w.rangeBetween(Window.unboundedPreceding, 2)),
     avg("price").over(w.rowsBetween(Window.currentRow, 4))
   )
- Parameters:
- window- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
over
Defines an empty analytic clause. In this case the analytic function is applied and presented for all rows in the result set.

   df.select(
     sum("price").over(),
     avg("price").over()
   )
- Returns:
- (undocumented)
- Since:
- 2.0.0
 
- 
outer
Mark this column as an outer column if its expression refers to columns from an outer query. This is used to trigger lazy analysis of a Spark Classic DataFrame, so that it can be used to build subquery expressions. Spark Connect DataFrames are always lazily analyzed and do not need this function.

   // Spark can't analyze this `df` now, as it doesn't know how to resolve `t1.col`.
   val df = spark.table("t2").where($"t2.col" === $"t1.col".outer())
   // Since this `df` is lazily analyzed, you won't see any error until you try to execute it.
   df.collect() // Fails with UNRESOLVED_COLUMN error.
   // Now Spark can resolve `t1.col` with the outer plan `spark.table("t1")`.
   spark.table("t1").where(df.exists())
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
transform
Concise syntax for chaining custom transformations.

   def addPrefix(c: Column): Column = concat(lit("prefix_"), c)
   df.select($"name".transform(addPrefix))

   // Chaining multiple transformations
   df.select($"name".transform(addPrefix).transform(upper))
- Parameters:
- f- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
 