Class functions
 You can call the functions defined here in two ways: _FUNC_(...) and
 functions.expr("_FUNC_(...)").
 
 As an example, regr_count is a function that is defined here. You can use
 regr_count(col("yCol"), col("xCol")) to invoke the regr_count function. This way the
 programming language's compiler ensures that regr_count exists and is called with the proper
 form. You can also use expr("regr_count(yCol, xCol)") to invoke the same function. In this case,
 Spark itself will ensure regr_count exists when it analyzes the query.
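 
 As a minimal Java sketch of the two call styles: the DataFrame df and the column names
 yCol and xCol below are placeholder assumptions, not part of this API.
 
   // Both invocation styles for regr_count; "df", "yCol" and "xCol" are assumed,
   // hypothetical names for an existing DataFrame and its numeric columns.
   import static org.apache.spark.sql.functions.col;
   import static org.apache.spark.sql.functions.expr;
   import static org.apache.spark.sql.functions.regr_count;
   
   import org.apache.spark.sql.Dataset;
   import org.apache.spark.sql.Row;
   
   public class RegrCountExample {
     static void bothWays(Dataset<Row> df) {
       // Compile-time checked: the compiler verifies that regr_count exists
       // and that it takes two Column arguments.
       df.select(regr_count(col("yCol"), col("xCol"))).show();
   
       // Analysis-time checked: Spark resolves regr_count only when it
       // analyzes the query built from the expression string.
       df.select(expr("regr_count(yCol, xCol)")).show();
     }
   }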
 
 You can find the entire list of functions in the SQL API documentation for your Spark version;
 see also the latest list.
 
 These function APIs usually have methods with Column signatures only, because a Column-typed
 parameter can accept not only a Column but also other types such as a native string. The other
 variants currently exist for historical reasons.
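 
 For illustration, here is a small hedged sketch of that convention: the column names "name"
 and "price" are assumptions used only for this example.
 
   // Column-only signatures still cover non-Column inputs via col(...) and lit(...);
   // "name" and "price" are hypothetical column names used only for illustration.
   import static org.apache.spark.sql.functions.avg;
   import static org.apache.spark.sql.functions.col;
   import static org.apache.spark.sql.functions.lit;
   import static org.apache.spark.sql.functions.upper;
   
   import org.apache.spark.sql.Column;
   
   public class ColumnSignatureExample {
     static void examples() {
       Column fromColumn = upper(col("name"));    // a column reference
       Column fromLiteral = upper(lit("spark"));  // a native string wrapped as a literal
   
       // Historical variant: some older functions also accept a bare column name.
       Column historical = avg("price");          // equivalent to avg(col("price"))
     }
   }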
 
Since: 1.3.0

Nested Class Summary

Constructor Summary

Method SummaryModifier and TypeMethodDescriptionstatic ColumnComputes the absolute value of a numeric value.static Columnstatic Columnstatic Columnstatic Columnstatic Columnadd_months(Column startDate, int numMonths) Returns the date that isnumMonthsafterstartDate.static Columnadd_months(Column startDate, Column numMonths) Returns the date that isnumMonthsafterstartDate.static Columnaes_decrypt(Column input, Column key) Returns a decrypted value ofinput.static Columnaes_decrypt(Column input, Column key, Column mode) Returns a decrypted value ofinput.static Columnaes_decrypt(Column input, Column key, Column mode, Column padding) Returns a decrypted value ofinput.static ColumnReturns a decrypted value ofinputusing AES inmodewithpadding.static Columnaes_encrypt(Column input, Column key) Returns an encrypted value ofinput.static Columnaes_encrypt(Column input, Column key, Column mode) Returns an encrypted value ofinput.static Columnaes_encrypt(Column input, Column key, Column mode, Column padding) Returns an encrypted value ofinput.static ColumnReturns an encrypted value ofinput.static ColumnReturns an encrypted value ofinputusing AES in givenmodewith the specifiedpadding.static ColumnApplies a binary operator to an initial state and all elements in the array, and reduces this to a single state.static Columnaggregate(Column expr, Column initialValue, scala.Function2<Column, Column, Column> merge, scala.Function1<Column, Column> finish) Applies a binary operator to an initial state and all elements in the array, and reduces this to a single state.static ColumnAggregate function: returns true if at least one value ofeis true.static ColumnAggregate function: returns some value ofefor a group of rows.static ColumnAggregate function: returns some value ofefor a group of rows.static Columnapprox_count_distinct(String columnName) Aggregate function: returns the approximate number of distinct items in a group.static Columnapprox_count_distinct(String columnName, double rsd) Aggregate function: returns the approximate number of distinct items in a group.static ColumnAggregate function: returns the approximate number of distinct items in a group.static Columnapprox_count_distinct(Column e, double rsd) Aggregate function: returns the approximate number of distinct items in a group.static Columnapprox_percentile(Column e, Column percentage, Column accuracy) Aggregate function: returns the approximatepercentileof the numeric columncolwhich is the smallest value in the orderedcolvalues (sorted from least to greatest) such that no more thanpercentageofcolvalues is less than the value or equal to that value.static ColumnapproxCountDistinct(String columnName) Deprecated.Use approx_count_distinct.static ColumnapproxCountDistinct(String columnName, double rsd) Deprecated.Use approx_count_distinct.static ColumnDeprecated.Use approx_count_distinct.static ColumnapproxCountDistinct(Column e, double rsd) Deprecated.Use approx_count_distinct.static ColumnCreates a new array column.static ColumnCreates a new array column.static ColumnCreates a new array column.static ColumnCreates a new array column.static ColumnAggregate function: returns a list of objects with duplicates.static Columnarray_append(Column column, Object element) Returns an ARRAY containing all elements from the source ARRAY as well as the new element.static Columnarray_compact(Column column) Remove all null elements from the given array.static Columnarray_contains(Column column, Object value) Returns null if the array is null, true if the array 
containsvalue, and false otherwise.static ColumnRemoves duplicate values from the array.static Columnarray_except(Column col1, Column col2) Returns an array of the elements in the first array but not in the second array, without duplicates.static Columnarray_insert(Column arr, Column pos, Column value) Adds an item into a given array at a specified positionstatic Columnarray_intersect(Column col1, Column col2) Returns an array of the elements in the intersection of the given two arrays, without duplicates.static Columnarray_join(Column column, String delimiter) Concatenates the elements ofcolumnusing thedelimiter.static Columnarray_join(Column column, String delimiter, String nullReplacement) Concatenates the elements ofcolumnusing thedelimiter.static ColumnReturns the maximum value in the array.static ColumnReturns the minimum value in the array.static Columnarray_position(Column column, Object value) Locates the position of the first occurrence of the value in the given array as long.static Columnarray_prepend(Column column, Object element) Returns an array containing value as well as all elements from array.static Columnarray_remove(Column column, Object element) Remove all elements that equal to element from the given array.static Columnarray_repeat(Column e, int count) Creates an array containing the left argument repeated the number of times given by the right argument.static Columnarray_repeat(Column left, Column right) Creates an array containing the left argument repeated the number of times given by the right argument.static Columnarray_size(Column e) Returns the total number of elements in the array.static Columnarray_sort(Column e) Sorts the input array in ascending order.static Columnarray_sort(Column e, scala.Function2<Column, Column, Column> comparator) Sorts the input array based on the given comparator function.static Columnarray_union(Column col1, Column col2) Returns an array of the elements in the union of the given two arrays, without duplicates.static Columnarrays_overlap(Column a1, Column a2) Returnstrueifa1anda2have at least one non-null element in common.static Columnarrays_zip(Column... 
e) Returns a merged array of structs in which the N-th struct contains all N-th values of input arrays.static Columnarrays_zip(scala.collection.immutable.Seq<Column> e) Returns a merged array of structs in which the N-th struct contains all N-th values of input arrays.static ColumnReturns a sort expression based on ascending order of the column.static Columnasc_nulls_first(String columnName) Returns a sort expression based on ascending order of the column, and null values return before non-null values.static Columnasc_nulls_last(String columnName) Returns a sort expression based on ascending order of the column, and null values appear after non-null values.static ColumnComputes the numeric value of the first character of the string column, and returns the result as an int column.static Columnstatic Columnstatic Columnstatic Columnstatic ColumnReturns null if the condition is true, and throws an exception otherwise.static Columnassert_true(Column c, Column e) Returns null if the condition is true; throws an exception with the error message otherwise.static Columnstatic Columnstatic Columnstatic Columnstatic Columnstatic Columnstatic Columnstatic Columnstatic Columnstatic Columnstatic Columnstatic Columnstatic ColumnAggregate function: returns the average of the values in a group.static ColumnAggregate function: returns the average of the values in a group.static ColumnComputes the BASE64 encoding of a binary column and returns it as a string column.static ColumnAn expression that returns the string representation of the binary value of the given long column.static ColumnAn expression that returns the string representation of the binary value of the given long column.static ColumnAggregate function: returns the bitwise AND of all non-null input values, or null if none.static ColumnReturns the number of bits that are set in the argument expr as an unsigned 64-bit integer, or NULL if the argument is NULL.static ColumnReturns the value of the bit (0 or 1) at the specified position.static Columnbit_length(Column e) Calculates the bit length for the specified string column.static ColumnAggregate function: returns the bitwise OR of all non-null input values, or null if none.static ColumnAggregate function: returns the bitwise XOR of all non-null input values, or null if none.static Columnbitmap_and_agg(Column col) Returns a bitmap that is the bitwise AND of all of the bitmaps from the input column.static ColumnReturns the bucket number for the given input column.static ColumnReturns the bit position for the given input column.static ColumnReturns a bitmap with the positions of the bits set from all the values from the input column.static Columnbitmap_count(Column col) Returns the number of set bits in the input bitmap.static Columnbitmap_or_agg(Column col) Returns a bitmap that is the bitwise OR of all of the bitmaps from the input column.static ColumnComputes bitwise NOT (~) of a number.static ColumnbitwiseNOT(Column e) Deprecated.Use bitwise_not.static ColumnAggregate function: returns true if all values ofeare true.static ColumnAggregate function: returns true if at least one value ofeis true.static <U> DatasetMarks a DataFrame as small enough for use in broadcast joins.static ColumnReturns the value of the columnerounded to 0 decimal places with HALF_EVEN round mode.static ColumnRound the value ofetoscaledecimal places with HALF_EVEN round mode ifscaleis greater than or equal to 0 or at integral part whenscaleis less than 0.static ColumnRound the value ofetoscaledecimal places with HALF_EVEN 
round mode ifscaleis greater than or equal to 0 or at integral part whenscaleis less than 0.static ColumnRemoves the leading and trailing space characters fromstr.static ColumnRemove the leading and trailingtrimcharacters fromstr.static Column(Java-specific) A transform for any type that partitions by a hash of the input column.static Column(Java-specific) A transform for any type that partitions by a hash of the input column.static Columncall_function(String funcName, Column... cols) Call a SQL function.static Columncall_function(String funcName, scala.collection.immutable.Seq<Column> cols) Call a SQL function.static ColumnCall an user-defined function.static ColumnCall an user-defined function.static ColumnCall an user-defined function.static ColumnDeprecated.Use call_udf.static ColumnReturns length of array or map.static ColumnComputes the cube-root of the given column.static ColumnComputes the cube-root of the given value.static ColumnComputes the ceiling of the given value ofeto 0 decimal places.static ColumnComputes the ceiling of the given value ofeto 0 decimal places.static ColumnComputes the ceiling of the given value ofetoscaledecimal places.static ColumnComputes the ceiling of the given value ofeto 0 decimal places.static ColumnComputes the ceiling of the given value ofetoscaledecimal places.static Columnchar_length(Column str) Returns the character length of string data or number of bytes of binary data.static Columncharacter_length(Column str) Returns the character length of string data or number of bytes of binary data.static ColumnReturns the ASCII character having the binary equivalent ton.static ColumnReturns the first column that is not null, or null if all inputs are null.static ColumnReturns the first column that is not null, or null if all inputs are null.static ColumnReturns aColumnbased on the given column name.static ColumnMarks a given column with specified collation.static ColumnReturns the collation name of a given column.static Columncollect_list(String columnName) Aggregate function: returns a list of objects with duplicates.static ColumnAggregate function: returns a list of objects with duplicates.static Columncollect_set(String columnName) Aggregate function: returns a set of objects with duplicate elements eliminated.static ColumnAggregate function: returns a set of objects with duplicate elements eliminated.static ColumnReturns aColumnbased on the given column name.static ColumnConcatenates multiple input columns together into a single column.static ColumnConcatenates multiple input columns together into a single column.static ColumnConcatenates multiple input string columns together into a single string column, using the given separator.static ColumnConcatenates multiple input string columns together into a single string column, using the given separator.static ColumnReturns a boolean.static ColumnConvert a number in a string column from one base to another.static Columnconvert_timezone(Column targetTz, Column sourceTs) Converts the timestamp without time zonesourceTsfrom the current time zone totargetTz.static Columnconvert_timezone(Column sourceTz, Column targetTz, Column sourceTs) Converts the timestamp without time zonesourceTsfrom thesourceTztime zone totargetTz.static ColumnAggregate function: returns the Pearson Correlation Coefficient for two columns.static ColumnAggregate function: returns the Pearson Correlation Coefficient for two columns.static Columnstatic Columnstatic Columnstatic Columnstatic Columnstatic TypedColumn<Object,Object> Aggregate 
function: returns the number of items in a group.static ColumnAggregate function: returns the number of items in a group.static Columncount_distinct(Column expr, Column... exprs) Aggregate function: returns the number of distinct items in a group.static Columncount_distinct(Column expr, scala.collection.immutable.Seq<Column> exprs) Aggregate function: returns the number of distinct items in a group.static ColumnAggregate function: returns the number ofTRUEvalues for the expression.static Columncount_min_sketch(Column e, Column eps, Column confidence) Returns a count-min sketch of a column with the given esp, confidence and seed.static Columncount_min_sketch(Column e, Column eps, Column confidence, Column seed) Returns a count-min sketch of a column with the given esp, confidence and seed.static ColumncountDistinct(String columnName, String... columnNames) Aggregate function: returns the number of distinct items in a group.static ColumncountDistinct(String columnName, scala.collection.immutable.Seq<String> columnNames) Aggregate function: returns the number of distinct items in a group.static ColumncountDistinct(Column expr, Column... exprs) Aggregate function: returns the number of distinct items in a group.static ColumncountDistinct(Column expr, scala.collection.immutable.Seq<Column> exprs) Aggregate function: returns the number of distinct items in a group.static ColumnAggregate function: returns the population covariance for two columns.static ColumnAggregate function: returns the population covariance for two columns.static Columncovar_samp(String columnName1, String columnName2) Aggregate function: returns the sample covariance for two columns.static Columncovar_samp(Column column1, Column column2) Aggregate function: returns the sample covariance for two columns.static ColumnCalculates the cyclic redundancy check value (CRC32) of a binary column and returns the value as a bigint.static Columnstatic ColumnWindow function: returns the cumulative distribution of values within a window partition, i.e.static Columncurdate()Returns the current date at the start of query evaluation as a date column.static ColumnReturns the current catalog.static ColumnReturns the current database.static ColumnReturns the current date at the start of query evaluation as a date column.static ColumnReturns the current schema.static ColumnReturns the current time at the start of query evaluation.static Columncurrent_time(int precision) Returns the current time at the start of query evaluation.static ColumnReturns the current timestamp at the start of query evaluation as a timestamp column.static ColumnReturns the current session local timezone.static ColumnReturns the user name of current execution context.static ColumnReturns the date that isdaysdays afterstartstatic ColumnReturns the date that isdaysdays afterstartstatic ColumnReturns the number of days fromstarttoend.static Columndate_format(Column dateExpr, String format) Converts a date/timestamp/string to a value of string in the format specified by the date format given by the second argument.static Columndate_from_unix_date(Column days) Create date from the number ofdayssince 1970-01-01.static ColumnExtracts a part of the date/timestamp or interval source.static ColumnReturns the date that isdaysdays beforestartstatic ColumnReturns the date that isdaysdays beforestartstatic Columndate_trunc(String format, Column timestamp) Returns timestamp truncated to the unit specified by the format.static ColumnReturns the date that isdaysdays afterstartstatic 
ColumnReturns the number of days fromstarttoend.static ColumnExtracts a part of the date/timestamp or interval source.static ColumnExtracts the day of the month as an integer from a given date/timestamp/string.static ColumnExtracts the three-letter abbreviated day name from a given date/timestamp/string.static Columndayofmonth(Column e) Extracts the day of the month as an integer from a given date/timestamp/string.static ColumnExtracts the day of the week as an integer from a given date/timestamp/string.static ColumnExtracts the day of the year as an integer from a given date/timestamp/string.static Column(Java-specific) A transform for timestamps and dates to partition data into days.static ColumnComputes the first argument into a string from a binary using the provided character set (one of 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16', 'UTF-32').static ColumnConverts an angle measured in radians to an approximately equivalent angle measured in degrees.static ColumnConverts an angle measured in radians to an approximately equivalent angle measured in degrees.static ColumnWindow function: returns the rank of rows within a window partition, without any gaps.static ColumnReturns a sort expression based on the descending order of the column.static Columndesc_nulls_first(String columnName) Returns a sort expression based on the descending order of the column, and null values appear before non-null values.static Columndesc_nulls_last(String columnName) Returns a sort expression based on the descending order of the column, and null values appear after non-null values.static Columne()Returns Euler's number.static Columnelement_at(Column column, Object value) Returns element of array at given index in value if column is array.static ColumnReturns then-th input, e.g., returnsinput2whennis 2.static ColumnReturns then-th input, e.g., returnsinput2whennis 2.static ColumnComputes the first argument into a binary from a string using the provided character set (one of 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16', 'UTF-32').static ColumnReturns a boolean.static Columnequal_null(Column col1, Column col2) Returns same result as the EQUAL(=) operator for non-null operands, but returns true if both are null, false if one of the them is null.static ColumnAggregate function: returns true if all values ofeare true.static ColumnReturns whether a predicate holds for one or more elements in the array.static ColumnComputes the exponential of the given column.static ColumnComputes the exponential of the given value.static ColumnCreates a new row for each element in the given array or map column.static ColumnCreates a new row for each element in the given array or map column.static ColumnComputes the exponential of the given column minus one.static ColumnComputes the exponential of the given value minus one.static ColumnParses the expression string into the column that it represents, similar toDataset.selectExpr(java.lang.String...).static ColumnExtracts a part of the date/timestamp or interval source.static ColumnComputes the factorial of the given value.static ColumnReturns an array of elements for which a predicate holds in a given array.static ColumnReturns an array of elements for which a predicate holds in a given array.static Columnfind_in_set(Column str, Column strArray) Returns the index (1-based) of the given string (str) in the comma-delimited list (strArray).static ColumnAggregate function: returns the first value of a column in a group.static ColumnAggregate 
function: returns the first value of a column in a group.static ColumnAggregate function: returns the first value in a group.static ColumnAggregate function: returns the first value in a group.static ColumnAggregate function: returns the first value in a group.static Columnfirst_value(Column e, Column ignoreNulls) Aggregate function: returns the first value in a group.static ColumnCreates a single array from an array of arrays.static ColumnComputes the floor of the given column value to 0 decimal places.static ColumnComputes the floor of the given value ofeto 0 decimal places.static ColumnComputes the floor of the given value ofetoscaledecimal places.static ColumnReturns whether a predicate holds for every element in the array.static Columnformat_number(Column x, int d) Formats numeric column x to a format like '#,###,###.##', rounded to d decimal places with HALF_EVEN round mode, and returns the result as a string column.static Columnformat_string(String format, Column... arguments) Formats the arguments in printf-style and returns the result as a string column.static Columnformat_string(String format, scala.collection.immutable.Seq<Column> arguments) Formats the arguments in printf-style and returns the result as a string column.static Column(Java-specific) Parses a column containing a CSV string into aStructTypewith the specified schema.static Columnfrom_csv(Column e, StructType schema, scala.collection.immutable.Map<String, String> options) Parses a column containing a CSV string into aStructTypewith the specified schema.static Column(Java-specific) Parses a column containing a JSON string into aMapTypewithStringTypeas keys type,StructTypeorArrayTypewith the specified schema.static Column(Scala-specific) Parses a column containing a JSON string into aMapTypewithStringTypeas keys type,StructTypeorArrayTypewith the specified schema.static Column(Scala-specific) Parses a column containing a JSON string into aMapTypewithStringTypeas keys type,StructTypeorArrayTypeofStructTypes with the specified schema.static Column(Java-specific) Parses a column containing a JSON string into aMapTypewithStringTypeas keys type,StructTypeorArrayTypeofStructTypes with the specified schema.static ColumnParses a column containing a JSON string into aMapTypewithStringTypeas keys type,StructTypeorArrayTypewith the specified schema.static Column(Java-specific) Parses a column containing a JSON string into aMapTypewithStringTypeas keys type,StructTypeorArrayTypewith the specified schema.static Column(Scala-specific) Parses a column containing a JSON string into aMapTypewithStringTypeas keys type,StructTypeorArrayTypewith the specified schema.static Columnfrom_json(Column e, StructType schema) Parses a column containing a JSON string into aStructTypewith the specified schema.static Column(Java-specific) Parses a column containing a JSON string into aStructTypewith the specified schema.static Columnfrom_json(Column e, StructType schema, scala.collection.immutable.Map<String, String> options) (Scala-specific) Parses a column containing a JSON string into aStructTypewith the specified schema.static Columnfrom_unixtime(Column ut) Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the yyyy-MM-dd HH:mm:ss format.static Columnfrom_unixtime(Column ut, String f) Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time 
zone in the given format.static Columnfrom_utc_timestamp(Column ts, String tz) Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone.static Columnfrom_utc_timestamp(Column ts, Column tz) Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone.static Column(Java-specific) Parses a column containing a XML string into aStructTypewith the specified schema.static Column(Java-specific) Parses a column containing a XML string into aStructTypewith the specified schema.static Column(Java-specific) Parses a column containing a XML string into aStructTypewith the specified schema.static Columnfrom_xml(Column e, StructType schema) Parses a column containing a XML string into the data type corresponding to the specified schema.static ColumnParses a column containing a XML string into the data type corresponding to the specified schema.static ColumnReturns element of array at given (0-based) index.static Columnget_json_object(Column e, String path) Extracts json object from a json string based on json path specified, and returns json string of the extracted json object.static ColumnReturns the value of the bit (0 or 1) at the specified position.static ColumnReturns the greatest value of the list of column names, skipping null values.static ColumnReturns the greatest value of the list of column names, skipping null values.static ColumnReturns the greatest value of the list of values, skipping null values.static ColumnReturns the greatest value of the list of values, skipping null values.static ColumnAggregate function: indicates whether a specified column in a GROUP BY list is aggregated or not, returns 1 for aggregated or 0 for not aggregated in the result set.static ColumnAggregate function: indicates whether a specified column in a GROUP BY list is aggregated or not, returns 1 for aggregated or 0 for not aggregated in the result set.static Columngrouping_id(String colName, String... colNames) Aggregate function: returns the level of grouping, equals tostatic Columngrouping_id(String colName, scala.collection.immutable.Seq<String> colNames) Aggregate function: returns the level of grouping, equals tostatic Columngrouping_id(Column... 
cols) Aggregate function: returns the level of grouping, equals tostatic Columngrouping_id(scala.collection.immutable.Seq<Column> cols) Aggregate function: returns the level of grouping, equals tostatic ColumnCalculates the hash code of given columns, and returns the result as an int column.static ColumnCalculates the hash code of given columns, and returns the result as an int column.static ColumnComputes hex value of the given column.static Columnhistogram_numeric(Column e, Column nBins) Aggregate function: computes a histogram on numeric 'expr' using nb bins.static Columnhll_sketch_agg(String columnName) Aggregate function: returns the updatable binary representation of the Datasketches HllSketch configured with default lgConfigK value.static Columnhll_sketch_agg(String columnName, int lgConfigK) Aggregate function: returns the updatable binary representation of the Datasketches HllSketch configured with lgConfigK arg.static ColumnAggregate function: returns the updatable binary representation of the Datasketches HllSketch configured with default lgConfigK value.static Columnhll_sketch_agg(Column e, int lgConfigK) Aggregate function: returns the updatable binary representation of the Datasketches HllSketch configured with lgConfigK arg.static Columnhll_sketch_agg(Column e, Column lgConfigK) Aggregate function: returns the updatable binary representation of the Datasketches HllSketch configured with lgConfigK arg.static Columnhll_sketch_estimate(String columnName) Returns the estimated number of unique values given the binary representation of a Datasketches HllSketch.static ColumnReturns the estimated number of unique values given the binary representation of a Datasketches HllSketch.static ColumnMerges two binary representations of Datasketches HllSketch objects, using a Datasketches Union object.static ColumnMerges two binary representations of Datasketches HllSketch objects, using a Datasketches Union object.static ColumnMerges two binary representations of Datasketches HllSketch objects, using a Datasketches Union object.static ColumnMerges two binary representations of Datasketches HllSketch objects, using a Datasketches Union object.static Columnhll_union_agg(String columnName) Aggregate function: returns the updatable binary representation of the Datasketches HllSketch, generated by merging previously created Datasketches HllSketch instances via a Datasketches Union instance.static Columnhll_union_agg(String columnName, boolean allowDifferentLgConfigK) Aggregate function: returns the updatable binary representation of the Datasketches HllSketch, generated by merging previously created Datasketches HllSketch instances via a Datasketches Union instance.static ColumnAggregate function: returns the updatable binary representation of the Datasketches HllSketch, generated by merging previously created Datasketches HllSketch instances via a Datasketches Union instance.static Columnhll_union_agg(Column e, boolean allowDifferentLgConfigK) Aggregate function: returns the updatable binary representation of the Datasketches HllSketch, generated by merging previously created Datasketches HllSketch instances via a Datasketches Union instance.static Columnhll_union_agg(Column e, Column allowDifferentLgConfigK) Aggregate function: returns the updatable binary representation of the Datasketches HllSketch, generated by merging previously created Datasketches HllSketch instances via a Datasketches Union instance.static ColumnExtracts the hours as an integer from a given 
date/time/timestamp/string.static Column(Java-specific) A transform for timestamps to partition data into hours.static ColumnComputessqrt(a^2^ + b^2^)without intermediate overflow or underflow.static ColumnComputessqrt(a^2^ + b^2^)without intermediate overflow or underflow.static ColumnComputessqrt(a^2^ + b^2^)without intermediate overflow or underflow.static ColumnComputessqrt(a^2^ + b^2^)without intermediate overflow or underflow.static ColumnComputessqrt(a^2^ + b^2^)without intermediate overflow or underflow.static ColumnComputessqrt(a^2^ + b^2^)without intermediate overflow or underflow.static ColumnComputessqrt(a^2^ + b^2^)without intermediate overflow or underflow.static ColumnComputessqrt(a^2^ + b^2^)without intermediate overflow or underflow.static ColumnReturnscol2ifcol1is null, orcol1otherwise.static ColumnReturns true if str matchespatternwithescapeChar('\') case-insensitively, null if any arguments are null, false otherwise.static ColumnReturns true if str matchespatternwithescapeCharcase-insensitively, null if any arguments are null, false otherwise.static ColumnReturns a new string column by converting the first letter of each word to uppercase.static ColumnCreates a new row for each element in the given array of structs.static ColumnCreates a new row for each element in the given array of structs.static ColumnReturns the length of the block being read, or -1 if not available.static ColumnReturns the start offset of the block being read, or -1 if not available.static ColumnCreates a string column for the file name of the current Spark task.static ColumnLocate the position of the first occurrence of substr column in the given string.static ColumnLocate the position of the first occurrence of substr column in the given string.static Columnis_valid_utf8(Column str) Returns true if the input is a valid UTF-8 string, otherwise returns false.static ColumnCheck if a variant value is a variant null.static ColumnReturn true iff the column is NaN.static ColumnReturns true ifcolis not null, or false otherwise.static ColumnReturn true iff the column is null.static Columnjava_method(Column... cols) Calls a method with reflection.static Columnjava_method(scala.collection.immutable.Seq<Column> cols) Calls a method with reflection.static ColumnReturns the number of elements in the outermost JSON array.static ColumnReturns all the keys of the outermost JSON object as an array.static Columnjson_tuple(Column json, String... 
fields) Creates a new row for a json column according to the given field names.static Columnjson_tuple(Column json, scala.collection.immutable.Seq<String> fields) Creates a new row for a json column according to the given field names.static ColumnAggregate function: returns the kurtosis of the values in a group.static ColumnAggregate function: returns the kurtosis of the values in a group.static ColumnWindow function: returns the value that isoffsetrows before the current row, andnullif there is less thanoffsetrows before the current row.static ColumnWindow function: returns the value that isoffsetrows before the current row, anddefaultValueif there is less thanoffsetrows before the current row.static ColumnWindow function: returns the value that isoffsetrows before the current row, andnullif there is less thanoffsetrows before the current row.static ColumnWindow function: returns the value that isoffsetrows before the current row, anddefaultValueif there is less thanoffsetrows before the current row.static ColumnWindow function: returns the value that isoffsetrows before the current row, anddefaultValueif there is less thanoffsetrows before the current row.static ColumnAggregate function: returns the last value of the column in a group.static ColumnAggregate function: returns the last value of the column in a group.static ColumnAggregate function: returns the last value in a group.static ColumnAggregate function: returns the last value in a group.static ColumnReturns the last day of the month which the given date belongs to.static Columnlast_value(Column e) Aggregate function: returns the last value in a group.static Columnlast_value(Column e, Column ignoreNulls) Aggregate function: returns the last value in a group.static ColumnReturnsstrwith all characters changed to lowercase.static ColumnWindow function: returns the value that isoffsetrows after the current row, andnullif there is less thanoffsetrows after the current row.static ColumnWindow function: returns the value that isoffsetrows after the current row, anddefaultValueif there is less thanoffsetrows after the current row.static ColumnWindow function: returns the value that isoffsetrows after the current row, andnullif there is less thanoffsetrows after the current row.static ColumnWindow function: returns the value that isoffsetrows after the current row, anddefaultValueif there is less thanoffsetrows after the current row.static ColumnWindow function: returns the value that isoffsetrows after the current row, anddefaultValueif there is less thanoffsetrows after the current row.static ColumnReturns the least value of the list of column names, skipping null values.static ColumnReturns the least value of the list of column names, skipping null values.static ColumnReturns the least value of the list of values, skipping null values.static ColumnReturns the least value of the list of values, skipping null values.static ColumnReturns the leftmostlen(lencan be string type) characters from the stringstr, iflenis less or equal than 0 the result is an empty string.static ColumnComputes the character length of a given string or number of bytes of a binary string.static ColumnComputes the character length of a given string or number of bytes of a binary string.static Columnlevenshtein(Column l, Column r) Computes the Levenshtein distance of the two given string columns.static Columnlevenshtein(Column l, Column r, int threshold) Computes the Levenshtein distance of the two given string columns if it's less than or equal to a given 
threshold.static ColumnReturns true if str matchespatternwithescapeChar('\'), null if any arguments are null, false otherwise.static ColumnReturns true if str matchespatternwithescapeChar, null if any arguments are null, false otherwise.static ColumnAggregate function: returns the concatenation of non-null input values.static ColumnAggregate function: returns the concatenation of non-null input values, separated by the delimiter.static ColumnAggregate function: returns the concatenation of distinct non-null input values.static Columnlistagg_distinct(Column e, Column delimiter) Aggregate function: returns the concatenation of distinct non-null input values, separated by the delimiter.static ColumnCreates aColumnof literal value.static ColumnComputes the natural logarithm of the given value.static ColumnReturns the current timestamp without time zone at the start of query evaluation as a timestamp without time zone column.static ColumnLocate the position of the first occurrence of substr.static ColumnLocate the position of the first occurrence of substr in a string column, after position pos.static ColumnReturns the first argument-base logarithm of the second argument.static ColumnReturns the first argument-base logarithm of the second argument.static ColumnComputes the natural logarithm of the given column.static ColumnComputes the natural logarithm of the given value.static ColumnComputes the logarithm of the given value in base 10.static ColumnComputes the logarithm of the given value in base 10.static ColumnComputes the natural logarithm of the given column plus one.static ColumnComputes the natural logarithm of the given value plus one.static ColumnComputes the logarithm of the given value in base 2.static ColumnComputes the logarithm of the given column in base 2.static ColumnConverts a string column to lower case.static ColumnLeft-pad the binary column with pad to a byte length of len.static ColumnLeft-pad the string column with pad to a length of len.static ColumnLeft-pad the string column with pad to a length of len.static ColumnTrim the spaces from left end for the specified string value.static ColumnTrim the specified character string from left end for the specified string column.static ColumnTrim the specified character string from left end for the specified string column.static Columnstatic ColumnMake DayTimeIntervalType duration.static Columnmake_dt_interval(Column days) Make DayTimeIntervalType duration from days.static Columnmake_dt_interval(Column days, Column hours) Make DayTimeIntervalType duration from days and hours.static Columnmake_dt_interval(Column days, Column hours, Column mins) Make DayTimeIntervalType duration from days, hours and mins.static Columnmake_dt_interval(Column days, Column hours, Column mins, Column secs) Make DayTimeIntervalType duration from days, hours, mins and secs.static ColumnMake interval.static Columnmake_interval(Column years) Make interval from years.static Columnmake_interval(Column years, Column months) Make interval from years and months.static Columnmake_interval(Column years, Column months, Column weeks) Make interval from years, months and weeks.static Columnmake_interval(Column years, Column months, Column weeks, Column days) Make interval from years, months, weeks and days.static ColumnMake interval from years, months, weeks, days and hours.static ColumnMake interval from years, months, weeks, days, hours and mins.static Columnmake_interval(Column years, Column months, Column weeks, Column days, Column hours, Column mins, Column 
secs) Make interval from years, months, weeks, days, hours, mins and secs.static ColumnCreate time from hour, minute and second fields.static Columnmake_timestamp(Column date, Column time) Create a local date-time from date and time fields.static Columnmake_timestamp(Column date, Column time, Column timezone) Create a local date-time from date, time, and timezone fields.static ColumnCreate timestamp from years, months, days, hours, mins and secs fields.static Columnmake_timestamp(Column years, Column months, Column days, Column hours, Column mins, Column secs, Column timezone) Create timestamp from years, months, days, hours, mins, secs and timezone fields.static Columnmake_timestamp_ltz(Column years, Column months, Column days, Column hours, Column mins, Column secs) Create the current timestamp with local time zone from years, months, days, hours, mins and secs fields.static Columnmake_timestamp_ltz(Column years, Column months, Column days, Column hours, Column mins, Column secs, Column timezone) Create the current timestamp with local time zone from years, months, days, hours, mins, secs and timezone fields.static Columnmake_timestamp_ntz(Column date, Column time) Create a local date-time from date and time fields.static Columnmake_timestamp_ntz(Column years, Column months, Column days, Column hours, Column mins, Column secs) Create local date-time from years, months, days, hours, mins, secs fields.static Columnmake_valid_utf8(Column str) Returns a new string in which all invalid UTF-8 byte sequences, if any, are replaced by the Unicode replacement character (U+FFFD).static ColumnMake year-month interval.static Columnmake_ym_interval(Column years) Make year-month interval from years.static Columnmake_ym_interval(Column years, Column months) Make year-month interval from years, months.static ColumnCreates a new map column.static ColumnCreates a new map column.static Columnmap_concat(Column... 
cols) Returns the union of all the given maps.static Columnmap_concat(scala.collection.immutable.Seq<Column> cols) Returns the union of all the given maps.static Columnmap_contains_key(Column column, Object key) Returns true if the map contains the key.static ColumnReturns an unordered array of all entries in the given map.static Columnmap_filter(Column expr, scala.Function2<Column, Column, Column> f) Returns a map whose key-value pairs satisfy a predicate.static Columnmap_from_arrays(Column keys, Column values) Creates a new map column.static ColumnReturns a map created from the given array of entries.static ColumnReturns an unordered array containing the keys of the map.static Columnmap_values(Column e) Returns an unordered array containing the values of the map.static ColumnMerge two given maps, key-wise into a single map using a function.static ColumnMasks the given string value.static ColumnMasks the given string value.static ColumnMasks the given string value.static ColumnMasks the given string value.static ColumnMasks the given string value.static ColumnAggregate function: returns the maximum value of the column in a group.static ColumnAggregate function: returns the maximum value of the expression in a group.static ColumnAggregate function: returns the value associated with the maximum value of ord.static ColumnCalculates the MD5 digest of a binary column and returns the value as a 32 character hex string.static ColumnAggregate function: returns the average of the values in a group.static ColumnAggregate function: returns the average of the values in a group.static ColumnAggregate function: returns the median of the values in a group.static ColumnAggregate function: returns the minimum value of the column in a group.static ColumnAggregate function: returns the minimum value of the expression in a group.static ColumnAggregate function: returns the value associated with the minimum value of ord.static ColumnExtracts the minutes as an integer from a given date/time/timestamp/string.static ColumnAggregate function: returns the most frequent value in a group.static ColumnAggregate function: returns the most frequent value in a group.static ColumnA column expression that generates monotonically increasing 64-bit integers.static ColumnDeprecated.Use monotonically_increasing_id().static ColumnExtracts the month as an integer from a given date/timestamp/string.static ColumnExtracts the three-letter abbreviated month name from a given date/timestamp/string.static Column(Java-specific) A transform for timestamps and dates to partition data into months.static Columnmonths_between(Column end, Column start) Returns number of months between datesstartandend.static Columnmonths_between(Column end, Column start, boolean roundOff) Returns number of months between datesendandstart.static Columnnamed_struct(Column... 
cols) Creates a struct with the given field names and values.static Columnnamed_struct(scala.collection.immutable.Seq<Column> cols) Creates a struct with the given field names and values.static ColumnReturns col1 if it is not NaN, or col2 if col1 is NaN.static ColumnUnary minus, i.e.static ColumnReturns the negated value.static ColumnReturns the first date which is later than the value of thedatecolumn that is on the specified day of the week.static ColumnReturns the first date which is later than the value of thedatecolumn that is on the specified day of the week.static ColumnInversion of boolean expression, i.e.static Columnnow()Returns the current timestamp at the start of query evaluation.static ColumnWindow function: returns the value that is theoffsetth row of the window frame (counting from 1), andnullif the size of window frame is less thanoffsetrows.static ColumnWindow function: returns the value that is theoffsetth row of the window frame (counting from 1), andnullif the size of window frame is less thanoffsetrows.static Columnntile(int n) Window function: returns the ntile group id (from 1 toninclusive) in an ordered window partition.static ColumnReturns null ifcol1equals tocol2, orcol1otherwise.static Columnnullifzero(Column col) Returns null ifcolis equal to zero, orcolotherwise.static ColumnReturnscol2ifcol1is null, orcol1otherwise.static ColumnReturnscol2ifcol1is not null, orcol3otherwise.static ColumnCalculates the byte length for the specified string column.static ColumnOverlay the specified portion ofsrcwithreplace, starting from byte positionposofsrc.static ColumnOverlay the specified portion ofsrcwithreplace, starting from byte positionposofsrcand proceeding forlenbytes.static Columnparse_json(Column json) Parses a JSON string and constructs a Variant value.static ColumnExtracts a part from a URL.static ColumnExtracts a part from a URL.static ColumnWindow function: returns the relative rank (i.e.static Columnpercentile(Column e, Column percentage) Aggregate function: returns the exact percentile(s) of numeric columnexprat the given percentage(s) with value range in [0.0, 1.0].static Columnpercentile(Column e, Column percentage, Column frequency) Aggregate function: returns the exact percentile(s) of numeric columnexprat the given percentage(s) with value range in [0.0, 1.0].static Columnpercentile_approx(Column e, Column percentage, Column accuracy) Aggregate function: returns the approximatepercentileof the numeric columncolwhich is the smallest value in the orderedcolvalues (sorted from least to greatest) such that no more thanpercentageofcolvalues is less than the value or equal to that value.static Columnpi()Returns Pi.static ColumnReturns the positive value of dividend mod divisor.static Columnposexplode(Column e) Creates a new row for each element with position in the given array or map column.static ColumnCreates a new row for each element with position in the given array or map column.static ColumnReturns the position of the first occurrence ofsubstrinstrafter position1.static ColumnReturns the position of the first occurrence ofsubstrinstrafter positionstart.static ColumnReturns the value.static ColumnReturns the value of the first argument raised to the power of the second argument.static ColumnReturns the value of the first argument raised to the power of the second argument.static ColumnReturns the value of the first argument raised to the power of the second argument.static ColumnReturns the value of the first argument raised to the power of the second 
argument.static ColumnReturns the value of the first argument raised to the power of the second argument.static ColumnReturns the value of the first argument raised to the power of the second argument.static ColumnReturns the value of the first argument raised to the power of the second argument.static ColumnReturns the value of the first argument raised to the power of the second argument.static ColumnReturns the value of the first argument raised to the power of the second argument.static ColumnFormats the arguments in printf-style and returns the result as a string column.static ColumnFormats the arguments in printf-style and returns the result as a string column.static ColumnAggregate function: returns the product of all numerical elements in a group.static ColumnExtracts the quarter as an integer from a given date/timestamp/string.static ColumnReturnsstrenclosed by single quotes and each instance of single quote in it is preceded by a backslash.static ColumnConverts an angle measured in degrees to an approximately equivalent angle measured in radians.static ColumnConverts an angle measured in degrees to an approximately equivalent angle measured in radians.static ColumnThrows an exception with the provided error message.static Columnrand()Generate a random column with independent and identically distributed (i.i.d.) samples uniformly distributed in [0.0, 1.0).static Columnrand(long seed) Generate a random column with independent and identically distributed (i.i.d.) samples uniformly distributed in [0.0, 1.0).static Columnrandn()Generate a column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.static Columnrandn(long seed) Generate a column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.static Columnrandom()Returns a random value with independent and identically distributed (i.i.d.) uniformly distributed values in [0, 1).static ColumnReturns a random value with independent and identically distributed (i.i.d.) 
uniformly distributed values in [0, 1).static ColumnReturns a string of the specified length whose characters are chosen uniformly at random from the following pool of characters: 0-9, a-z, A-Z.static ColumnReturns a string of the specified length whose characters are chosen uniformly at random from the following pool of characters: 0-9, a-z, A-Z, with the chosen random seed.static Columnrank()Window function: returns the rank of rows within a window partition.static ColumnApplies a binary operator to an initial state and all elements in the array, and reduces this to a single state.static Columnreduce(Column expr, Column initialValue, scala.Function2<Column, Column, Column> merge, scala.Function1<Column, Column> finish) Applies a binary operator to an initial state and all elements in the array, and reduces this to a single state.static ColumnCalls a method with reflection.static ColumnCalls a method with reflection.static ColumnReturns true ifstrmatchesregexp, or false otherwise.static Columnregexp_count(Column str, Column regexp) Returns a count of the number of times that the regular expression patternregexpis matched in the stringstr.static Columnregexp_extract(Column e, String exp, int groupIdx) Extract a specific group matched by a Java regex, from the specified string column.static Columnregexp_extract_all(Column str, Column regexp) Extract all strings in thestrthat match theregexpexpression and corresponding to the first regex group index.static Columnregexp_extract_all(Column str, Column regexp, Column idx) Extract all strings in thestrthat match theregexpexpression and corresponding to the regex group index.static Columnregexp_instr(Column str, Column regexp) Searches a string for a regular expression and returns an integer that indicates the beginning position of the matched substring.static Columnregexp_instr(Column str, Column regexp, Column idx) Searches a string for a regular expression and returns an integer that indicates the beginning position of the matched substring.static Columnregexp_like(Column str, Column regexp) Returns true ifstrmatchesregexp, or false otherwise.static Columnregexp_replace(Column e, String pattern, String replacement) Replace all substrings of the specified string value that match regexp with rep.static Columnregexp_replace(Column e, Column pattern, Column replacement) Replace all substrings of the specified string value that match regexp with rep.static Columnregexp_substr(Column str, Column regexp) Returns the substring that matches the regular expressionregexpwithin the stringstr.static ColumnAggregate function: returns the average of the independent variable for non-null pairs in a group, whereyis the dependent variable andxis the independent variable.static ColumnAggregate function: returns the average of the independent variable for non-null pairs in a group, whereyis the dependent variable andxis the independent variable.static Columnregr_count(Column y, Column x) Aggregate function: returns the number of non-null number pairs in a group, whereyis the dependent variable andxis the independent variable.static Columnregr_intercept(Column y, Column x) Aggregate function: returns the intercept of the univariate linear regression line for non-null pairs in a group, whereyis the dependent variable andxis the independent variable.static ColumnAggregate function: returns the coefficient of determination for non-null pairs in a group, whereyis the dependent variable andxis the independent variable.static Columnregr_slope(Column y, Column x) 
Aggregate function: returns the slope of the linear regression line for non-null pairs in a group, whereyis the dependent variable andxis the independent variable.static ColumnAggregate function: returns REGR_COUNT(y, x) * VAR_POP(x) for non-null pairs in a group, whereyis the dependent variable andxis the independent variable.static ColumnAggregate function: returns REGR_COUNT(y, x) * COVAR_POP(y, x) for non-null pairs in a group, whereyis the dependent variable andxis the independent variable.static ColumnAggregate function: returns REGR_COUNT(y, x) * VAR_POP(y) for non-null pairs in a group, whereyis the dependent variable andxis the independent variable.static ColumnRepeats a string column n times, and returns it as a new string column.static ColumnRepeats a string column n times, and returns it as a new string column.static ColumnReplaces all occurrences ofsearchwithreplace.static ColumnReplaces all occurrences ofsearchwithreplace.static ColumnReturns a reversed string or an array with reverse order of elements.static ColumnReturns the rightmostlen(lencan be string type) characters from the stringstr, iflenis less or equal than 0 the result is an empty string.static ColumnReturns the double value that is closest in value to the argument and is equal to a mathematical integer.static ColumnReturns the double value that is closest in value to the argument and is equal to a mathematical integer.static ColumnReturns true ifstrmatchesregexp, or false otherwise.static ColumnReturns the value of the columnerounded to 0 decimal places with HALF_UP round mode.static ColumnRound the value ofetoscaledecimal places with HALF_UP round mode ifscaleis greater than or equal to 0 or at integral part whenscaleis less than 0.static ColumnRound the value ofetoscaledecimal places with HALF_UP round mode ifscaleis greater than or equal to 0 or at integral part whenscaleis less than 0.static ColumnWindow function: returns a sequential number starting at 1 within a window partition.static ColumnRight-pad the binary column with pad to a byte length of len.static ColumnRight-pad the string column with pad to a length of len.static ColumnRight-pad the string column with pad to a length of len.static ColumnTrim the spaces from right end for the specified string value.static ColumnTrim the specified character string from right end for the specified string column.static ColumnTrim the specified character string from right end for the specified string column.static Columnschema_of_csv(String csv) Parses a CSV string and infers its schema in DDL format.static Columnschema_of_csv(Column csv) Parses a CSV string and infers its schema in DDL format.static Columnschema_of_csv(Column csv, Map<String, String> options) Parses a CSV string and infers its schema in DDL format using options.static Columnschema_of_json(String json) Parses a JSON string and infers its schema in DDL format.static Columnschema_of_json(Column json) Parses a JSON string and infers its schema in DDL format.static Columnschema_of_json(Column json, Map<String, String> options) Parses a JSON string and infers its schema in DDL format using options.static ColumnReturns schema in the SQL format of a variant.static ColumnReturns the merged schema in the SQL format of a variant column.static Columnschema_of_xml(String xml) Parses a XML string and infers its schema in DDL format.static Columnschema_of_xml(Column xml) Parses a XML string and infers its schema in DDL format.static Columnschema_of_xml(Column xml, Map<String, String> options) Parses a XML string 
and infers its schema in DDL format using options.static Columnstatic ColumnExtracts the seconds as an integer from a given date/time/timestamp/string.static ColumnSplits a string into arrays of sentences, where each sentence is an array of words.static ColumnSplits a string into arrays of sentences, where each sentence is an array of words.static ColumnSplits a string into arrays of sentences, where each sentence is an array of words.static ColumnGenerate a sequence of integers from start to stop, incrementing by 1 if start is less than or equal to stop, otherwise -1.static ColumnGenerate a sequence of integers from start to stop, incrementing by step.static ColumnReturns the user name of current execution context.static Columnsession_window(Column timeColumn, String gapDuration) Generates session window given a timestamp specifying column.static Columnsession_window(Column timeColumn, Column gapDuration) Generates session window given a timestamp specifying column.static ColumnReturns a sha1 hash value as a hex string of thecol.static ColumnCalculates the SHA-1 digest of a binary column and returns the value as a 40 character hex string.static ColumnCalculates the SHA-2 family of hash functions of a binary column and returns the value as a hex string.static ColumnShift the given value numBits left.static ColumnDeprecated.Use shiftleft.static Columnshiftright(Column e, int numBits) (Signed) shift the given value numBits right.static ColumnshiftRight(Column e, int numBits) Deprecated.Use shiftright.static Columnshiftrightunsigned(Column e, int numBits) Unsigned shift the given value numBits right.static ColumnshiftRightUnsigned(Column e, int numBits) Deprecated.Use shiftrightunsigned.static ColumnReturns a random permutation of the given array.static ColumnReturns a random permutation of the given array.static ColumnComputes the signum of the given value.static ColumnComputes the signum of the given column.static ColumnComputes the signum of the given value.static Columnstatic Columnstatic Columnstatic Columnstatic ColumnReturns length of array or map.static ColumnAggregate function: returns the skewness of the values in a group.static ColumnAggregate function: returns the skewness of the values in a group.static ColumnReturns an array containing all the elements inxfrom indexstart(or starting from the end ifstartis negative) with the specifiedlength.static ColumnReturns an array containing all the elements inxfrom indexstart(or starting from the end ifstartis negative) with the specifiedlength.static ColumnAggregate function: returns true if at least one value ofeis true.static Columnsort_array(Column e) Sorts the input array for the given column in ascending order, according to the natural ordering of the array elements.static Columnsort_array(Column e, boolean asc) Sorts the input array for the given column in ascending or descending order, according to the natural ordering of the array elements.static ColumnReturns the soundex code for the specified expression.static ColumnPartition ID.static ColumnSplits str around matches of the given pattern.static ColumnSplits str around matches of the given pattern.static ColumnSplits str around matches of the given pattern.static ColumnSplits str around matches of the given pattern.static Columnsplit_part(Column str, Column delimiter, Column partNum) Splitsstrby delimiter and return requested part of the split (1-based).static ColumnComputes the square root of the specified float value.static ColumnComputes the square root of the specified float 
value.static ColumnSeparatescol1, ...,colkintonrows.static ColumnSeparatescol1, ...,colkintonrows.static Columnstartswith(Column str, Column prefix) Returns a boolean.static ColumnAggregate function: alias forstddev_samp.static ColumnAggregate function: alias forstddev_samp.static ColumnAggregate function: alias forstddev_samp.static Columnstddev_pop(String columnName) Aggregate function: returns the population standard deviation of the expression in a group.static Columnstddev_pop(Column e) Aggregate function: returns the population standard deviation of the expression in a group.static Columnstddev_samp(String columnName) Aggregate function: returns the sample standard deviation of the expression in a group.static ColumnAggregate function: returns the sample standard deviation of the expression in a group.static Columnstr_to_map(Column text) Creates a map after splitting the text into key/value pairs using delimiters.static Columnstr_to_map(Column text, Column pairDelim) Creates a map after splitting the text into key/value pairs using delimiters.static Columnstr_to_map(Column text, Column pairDelim, Column keyValueDelim) Creates a map after splitting the text into key/value pairs using delimiters.static Columnstring_agg(Column e) Aggregate function: returns the concatenation of non-null input values.static Columnstring_agg(Column e, Column delimiter) Aggregate function: returns the concatenation of non-null input values, separated by the delimiter.static ColumnAggregate function: returns the concatenation of distinct non-null input values.static Columnstring_agg_distinct(Column e, Column delimiter) Aggregate function: returns the concatenation of distinct non-null input values, separated by the delimiter.static ColumnCreates a new struct column that composes multiple input columns.static ColumnCreates a new struct column that composes multiple input columns.static ColumnCreates a new struct column.static ColumnCreates a new struct column.static ColumnReturns the substring ofstrthat starts atpos, or the slice of byte array that starts atpos.static ColumnReturns the substring ofstrthat starts atposand is of lengthlen, or the slice of byte array that starts atposand is of lengthlen.static ColumnSubstring starts atposand is of lengthlenwhen str is String type or returns the slice of byte array that starts atposin byte and is of lengthlenwhen str is Binary typestatic ColumnSubstring starts atposand is of lengthlenwhen str is String type or returns the slice of byte array that starts atposin byte and is of lengthlenwhen str is Binary typestatic Columnsubstring_index(Column str, String delim, int count) Returns the substring from string str before count occurrences of the delimiter delim.static ColumnAggregate function: returns the sum of all values in the given column.static ColumnAggregate function: returns the sum of all values in the expression.static ColumnAggregate function: returns the sum of distinct values in the expression.static ColumnsumDistinct(String columnName) Deprecated.Use sum_distinct.static ColumnDeprecated.Use sum_distinct.static Columnstatic Columnstatic Columnstatic Columnstatic Columntheta_difference(String columnName1, String columnName2) Subtracts two binary representations of Datasketches ThetaSketch objects in the input columns using a Datasketches AnotB objectstatic Columntheta_difference(Column c1, Column c2) Subtracts two binary representations of Datasketches ThetaSketch objects in the input columns using a Datasketches AnotB objectstatic 
Columntheta_intersection(String columnName1, String columnName2) Intersects two binary representations of Datasketches ThetaSketch objects in the input columns using a Datasketches Intersection objectstatic Columntheta_intersection(Column c1, Column c2) Intersects two binary representations of Datasketches ThetaSketch objects in the input columns using a Datasketches Intersection objectstatic Columntheta_intersection_agg(String columnName) Aggregate function: returns the compact binary representation of the Datasketches ThetaSketch, generated by intersecting the Datasketches ThetaSketch instances in the input volumn via a Datasketches Intersection instance.static ColumnAggregate function: returns the compact binary representation of the Datasketches ThetaSketch, generated by intersecting the Datasketches ThetaSketch instances in the input column via a Datasketches Intersection instance.static Columntheta_sketch_agg(String columnName) Aggregate function: returns the compact binary representation of the Datasketches ThetaSketch built with the values in the input column and configured with the default value of 12 forlgNomEntries.static Columntheta_sketch_agg(String columnName, int lgNomEntries) Aggregate function: returns the compact binary representation of the Datasketches ThetaSketch built with the values in the input column and configured with thelgNomEntriesnominal entries.static ColumnAggregate function: returns the compact binary representation of the Datasketches ThetaSketch built with the values in the input column and configured with the default value of 12 forlgNomEntries.static Columntheta_sketch_agg(Column e, int lgNomEntries) Aggregate function: returns the compact binary representation of the Datasketches ThetaSketch built with the values in the input column and configured with thelgNomEntriesnominal entries.static Columntheta_sketch_agg(Column e, Column lgNomEntries) Aggregate function: returns the compact binary representation of the Datasketches ThetaSketch built with the values in the input column and configured with thelgNomEntriesnominal entries.static Columntheta_sketch_estimate(String columnName) Returns the estimated number of unique values given the binary representation of a Datasketches ThetaSketch.static ColumnReturns the estimated number of unique values given the binary representation of a Datasketches ThetaSketch.static Columntheta_union(String columnName1, String columnName2) Unions two binary representations of Datasketches ThetaSketch objects in the input columns using a Datasketches Union object.static Columntheta_union(String columnName1, String columnName2, int lgNomEntries) Unions two binary representations of Datasketches ThetaSketch objects in the input columns using a Datasketches Union object.static Columntheta_union(Column c1, Column c2) Unions two binary representations of Datasketches ThetaSketch objects in the input columns using a Datasketches Union object.static Columntheta_union(Column c1, Column c2, int lgNomEntries) Unions two binary representations of Datasketches ThetaSketch objects in the input columns using a Datasketches Union object.static Columntheta_union(Column c1, Column c2, Column lgNomEntries) Unions two binary representations of Datasketches ThetaSketch objects in the input columns using a Datasketches Union object.static Columntheta_union_agg(String columnName) Aggregate function: returns the compact binary representation of the Datasketches ThetaSketch, generated by the union of Datasketches ThetaSketch instances in the input 
column via a Datasketches Union instance.static Columntheta_union_agg(String columnName, int lgNomEntries) Aggregate function: returns the compact binary representation of the Datasketches ThetaSketch, generated by the union of Datasketches ThetaSketch instances in the input column via a Datasketches Union instance.static ColumnAggregate function: returns the compact binary representation of the Datasketches ThetaSketch, generated by the union of Datasketches ThetaSketch instances in the input column via a Datasketches Union instance.static Columntheta_union_agg(Column e, int lgNomEntries) Aggregate function: returns the compact binary representation of the Datasketches ThetaSketch, generated by the union of Datasketches ThetaSketch instances in the input column via a Datasketches Union instance.static Columntheta_union_agg(Column e, Column lgNomEntries) Aggregate function: returns the compact binary representation of the Datasketches ThetaSketch, generated by the union of Datasketches ThetaSketch instances in the input column via a Datasketches Union instance.static ColumnReturns the difference between two times, measured in specified units.static Columntime_trunc(Column unit, Column time) Returnstimetruncated to theunit.static Columntimestamp_add(String unit, Column quantity, Column ts) Adds the specified number of units to the given timestamp.static Columntimestamp_diff(String unit, Column start, Column end) Gets the difference between the timestamps in the specified units by truncating the fraction part.static ColumnCreates timestamp from the number of microseconds since UTC epoch.static ColumnCreates timestamp from the number of milliseconds since UTC epoch.static ColumnConverts the number of seconds from the Unix epoch (1970-01-01T00:00:00Z) to a timestamp.static ColumnConverts the inputeto a binary value based on the default format "hex".static ColumnConverts the inputeto a binary value based on the suppliedformat.static ColumnConverteto a string based on theformat.static ColumnConverts a column containing aStructTypeinto a CSV string with the specified schema.static Column(Java-specific) Converts a column containing aStructTypeinto a CSV string with the specified schema.static ColumnConverts the column intoDateTypeby casting rules toDateType.static ColumnConverts the column into aDateTypewith a specified formatstatic ColumnConverts a column containing aStructType,ArrayTypeor aMapTypeinto a JSON string with the specified schema.static Column(Java-specific) Converts a column containing aStructType,ArrayTypeor aMapTypeinto a JSON string with the specified schema.static Column(Scala-specific) Converts a column containing aStructType,ArrayTypeor aMapTypeinto a JSON string with the specified schema.static ColumnConvert string 'e' to a number based on the string format 'format'.static ColumnParses a string value to a time value.static ColumnParses a string value to a time value.static ColumnConverts to a timestamp by casting rules toTimestampType.static Columnto_timestamp(Column s, String fmt) Converts time string with the given pattern to timestamp.static Columnto_timestamp_ltz(Column timestamp) Parses thetimestampexpression with the default format to a timestamp without time zone.static Columnto_timestamp_ltz(Column timestamp, Column format) Parses thetimestampexpression with theformatexpression to a timestamp without time zone.static Columnto_timestamp_ntz(Column timestamp) Parses thetimestampexpression with the default format to a timestamp without time zone.static 
Columnto_timestamp_ntz(Column timestamp, Column format) Parses thetimestamp_strexpression with theformatexpression to a timestamp without time zone.static Columnto_unix_timestamp(Column timeExp) Returns the UNIX timestamp of the given time.static Columnto_unix_timestamp(Column timeExp, Column format) Returns the UNIX timestamp of the given time.static Columnto_utc_timestamp(Column ts, String tz) Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC.static Columnto_utc_timestamp(Column ts, Column tz) Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC.static Columnto_varchar(Column e, Column format) Converteto a string based on theformat.static Columnto_variant_object(Column col) Converts a column containing nested inputs (array/map/struct) into a variants where maps and structs are converted to variant objects which are unordered unlike SQL structs.static ColumnConverts a column containing aStructTypeinto a XML string with the specified schema.static Column(Java-specific) Converts a column containing aStructTypeinto a XML string with the specified schema.static ColumnDeprecated.Use degrees.static ColumnDeprecated.Use degrees.static ColumnDeprecated.Use radians.static ColumnDeprecated.Use radians.static ColumnReturns an array of elements after applying a transformation to each element in the input array.static ColumnReturns an array of elements after applying a transformation to each element in the input array.static Columntransform_keys(Column expr, scala.Function2<Column, Column, Column> f) Applies a function to every key-value pair in a map and returns a map with the results of those applications as the new keys for the pairs.static Columntransform_values(Column expr, scala.Function2<Column, Column, Column> f) Applies a function to every key-value pair in a map and returns a map with the results of those applications as the new values for the pairs.static ColumnTranslate any character in the src by a character in replaceString.static ColumnTrim the spaces from both ends for the specified string column.static ColumnTrim the specified character from both ends for the specified string column.static ColumnTrim the specified character from both ends for the specified string column.static ColumnReturns date truncated to the unit specified by the format.static ColumnReturns the sum ofleftandrightand the result is null on overflow.static Columntry_aes_decrypt(Column input, Column key) Returns a decrypted value ofinput.static Columntry_aes_decrypt(Column input, Column key, Column mode) Returns a decrypted value ofinput.static Columntry_aes_decrypt(Column input, Column key, Column mode, Column padding) Returns a decrypted value ofinput.static ColumnThis is a special version ofaes_decryptthat performs the same operation, but returns a NULL value instead of raising an error if the decryption cannot be performed.static ColumnReturns the mean calculated from values of a group and the result is null on overflow.static Columntry_divide(Column left, Column right) Returnsdividend/divisor.static Columntry_element_at(Column column, Column value) (array, index) - Returns element of array at given (1-based) index.static Columntry_make_interval(Column years) This is a special version ofmake_intervalthat performs the same operation, but returns a NULL value instead of raising an error if interval cannot be created.static 
Columntry_make_interval(Column years, Column months) This is a special version ofmake_intervalthat performs the same operation, but returns a NULL value instead of raising an error if interval cannot be created.static Columntry_make_interval(Column years, Column months, Column weeks) This is a special version ofmake_intervalthat performs the same operation, but returns a NULL value instead of raising an error if interval cannot be created.static Columntry_make_interval(Column years, Column months, Column weeks, Column days) This is a special version ofmake_intervalthat performs the same operation, but returns a NULL value instead of raising an error if interval cannot be created.static ColumnThis is a special version ofmake_intervalthat performs the same operation, but returns a NULL value instead of raising an error if interval cannot be created.static Columntry_make_interval(Column years, Column months, Column weeks, Column days, Column hours, Column mins) This is a special version ofmake_intervalthat performs the same operation, but returns a NULL value instead of raising an error if interval cannot be created.static Columntry_make_interval(Column years, Column months, Column weeks, Column days, Column hours, Column mins, Column secs) This is a special version ofmake_intervalthat performs the same operation, but returns a NULL value instead of raising an error if interval cannot be created.static Columntry_make_timestamp(Column date, Column time) Try to create a local date-time from date and time fields.static Columntry_make_timestamp(Column date, Column time, Column timezone) Try to create a local date-time from date, time, and timezone fields.static Columntry_make_timestamp(Column years, Column months, Column days, Column hours, Column mins, Column secs) Try to create a timestamp from years, months, days, hours, mins, and secs fields.static Columntry_make_timestamp(Column years, Column months, Column days, Column hours, Column mins, Column secs, Column timezone) Try to create a timestamp from years, months, days, hours, mins, secs and timezone fields.static Columntry_make_timestamp_ltz(Column years, Column months, Column days, Column hours, Column mins, Column secs) Try to create the current timestamp with local time zone from years, months, days, hours, mins and secs fields.static Columntry_make_timestamp_ltz(Column years, Column months, Column days, Column hours, Column mins, Column secs, Column timezone) Try to create the current timestamp with local time zone from years, months, days, hours, mins, secs and timezone fields.static Columntry_make_timestamp_ntz(Column date, Column time) Try to create a local date-time from date and time fields.static Columntry_make_timestamp_ntz(Column years, Column months, Column days, Column hours, Column mins, Column secs) Try to create a local date-time from years, months, days, hours, mins, secs fields.static ColumnReturns the remainder ofdividend/divisor.static Columntry_multiply(Column left, Column right) Returnsleft*rightand the result is null on overflow.static Columntry_parse_json(Column json) Parses a JSON string and constructs a Variant value.static Columntry_parse_url(Column url, Column partToExtract) Extracts a part from a URL.static Columntry_parse_url(Column url, Column partToExtract, Column key) Extracts a part from a URL.static Columntry_reflect(Column... 
cols) This is a special version ofreflectthat performs the same operation, but returns a NULL value instead of raising an error if the invoke method thrown exception.static Columntry_reflect(scala.collection.immutable.Seq<Column> cols) This is a special version ofreflectthat performs the same operation, but returns a NULL value instead of raising an error if the invoke method thrown exception.static Columntry_subtract(Column left, Column right) Returnsleft-rightand the result is null on overflow.static ColumnReturns the sum calculated from values of a group and the result is null on overflow.static ColumnThis is a special version ofto_binarythat performs the same operation, but returns a NULL value instead of raising an error if the conversion cannot be performed.static Columntry_to_binary(Column e, Column f) This is a special version ofto_binarythat performs the same operation, but returns a NULL value instead of raising an error if the conversion cannot be performed.static ColumnThis is a special version ofto_datethat performs the same operation, but returns a NULL value instead of raising an error if date cannot be created.static Columntry_to_date(Column e, String fmt) This is a special version ofto_datethat performs the same operation, but returns a NULL value instead of raising an error if date cannot be created.static Columntry_to_number(Column e, Column format) Convert stringeto a number based on the string formatformat.static Columntry_to_time(Column str) Parses a string value to a time value.static Columntry_to_time(Column str, Column format) Parses a string value to a time value.static ColumnParses thesto a timestamp.static Columntry_to_timestamp(Column s, Column format) Parses theswith theformatto a timestamp.static Columntry_url_decode(Column str) This is a special version ofurl_decodethat performs the same operation, but returns a NULL value instead of raising an error if the decoding cannot be performed.static Columntry_validate_utf8(Column str) Returns the input value if it corresponds to a valid UTF-8 string, or NULL otherwise.static Columntry_variant_get(Column v, String path, String targetType) Extracts a sub-variant fromvaccording topathstring, and then cast the sub-variant totargetType.static Columntry_variant_get(Column v, Column path, String targetType) Extracts a sub-variant fromvaccording topathcolumn, and then cast the sub-variant totargetType.static <T> Columntypedlit(T literal, scala.reflect.api.TypeTags.TypeTag<T> evidence$2) Creates aColumnof literal value.static <T> ColumntypedLit(T literal, scala.reflect.api.TypeTags.TypeTag<T> evidence$1) Creates aColumnof literal value.static ColumnReturn DDL-formatted type string for the data type of the input.static ColumnReturnsstrwith all characters changed to uppercase.static <IN,BUF, OUT> 
 UserDefinedFunctionudaf(Aggregator<IN, BUF, OUT> agg, Encoder<IN> inputEncoder) Obtains aUserDefinedFunctionthat wraps the givenAggregatorso that it may be used with untyped Data Frames.static <IN,BUF, OUT> 
 UserDefinedFunctionudaf(Aggregator<IN, BUF, OUT> agg, scala.reflect.api.TypeTags.TypeTag<IN> evidence$3) Obtains aUserDefinedFunctionthat wraps the givenAggregatorso that it may be used with untyped Data Frames.static UserDefinedFunctionDeprecated.Scala `udf` method with return type parameter is deprecated.static UserDefinedFunctionDefines a Java UDF0 instance as user-defined function (UDF).static UserDefinedFunctionDefines a Java UDF1 instance as user-defined function (UDF).static UserDefinedFunctionDefines a Java UDF10 instance as user-defined function (UDF).static UserDefinedFunctionDefines a Java UDF2 instance as user-defined function (UDF).static UserDefinedFunctionDefines a Java UDF3 instance as user-defined function (UDF).static UserDefinedFunctionDefines a Java UDF4 instance as user-defined function (UDF).static UserDefinedFunctionDefines a Java UDF5 instance as user-defined function (UDF).static UserDefinedFunctionDefines a Java UDF6 instance as user-defined function (UDF).static UserDefinedFunctionDefines a Java UDF7 instance as user-defined function (UDF).static UserDefinedFunctionDefines a Java UDF8 instance as user-defined function (UDF).static UserDefinedFunctionDefines a Java UDF9 instance as user-defined function (UDF).static <RT> UserDefinedFunctionudf(scala.Function0<RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$4) Defines a Scala closure of 0 arguments as user-defined function (UDF).static <RT,A1> UserDefinedFunction udf(scala.Function1<A1, RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$5, scala.reflect.api.TypeTags.TypeTag<A1> evidence$6) Defines a Scala closure of 1 arguments as user-defined function (UDF).static <RT,A1, A2, A3, A4, A5, A6, A7, A8, A9, A10> 
 UserDefinedFunctionudf(scala.Function10<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$59, scala.reflect.api.TypeTags.TypeTag<A1> evidence$60, scala.reflect.api.TypeTags.TypeTag<A2> evidence$61, scala.reflect.api.TypeTags.TypeTag<A3> evidence$62, scala.reflect.api.TypeTags.TypeTag<A4> evidence$63, scala.reflect.api.TypeTags.TypeTag<A5> evidence$64, scala.reflect.api.TypeTags.TypeTag<A6> evidence$65, scala.reflect.api.TypeTags.TypeTag<A7> evidence$66, scala.reflect.api.TypeTags.TypeTag<A8> evidence$67, scala.reflect.api.TypeTags.TypeTag<A9> evidence$68, scala.reflect.api.TypeTags.TypeTag<A10> evidence$69) Defines a Scala closure of 10 arguments as user-defined function (UDF).static <RT,A1, A2> UserDefinedFunction udf(scala.Function2<A1, A2, RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$7, scala.reflect.api.TypeTags.TypeTag<A1> evidence$8, scala.reflect.api.TypeTags.TypeTag<A2> evidence$9) Defines a Scala closure of 2 arguments as user-defined function (UDF).static <RT,A1, A2, A3> 
 UserDefinedFunctionudf(scala.Function3<A1, A2, A3, RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$10, scala.reflect.api.TypeTags.TypeTag<A1> evidence$11, scala.reflect.api.TypeTags.TypeTag<A2> evidence$12, scala.reflect.api.TypeTags.TypeTag<A3> evidence$13) Defines a Scala closure of 3 arguments as user-defined function (UDF).static <RT,A1, A2, A3, A4> 
 UserDefinedFunctionudf(scala.Function4<A1, A2, A3, A4, RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$14, scala.reflect.api.TypeTags.TypeTag<A1> evidence$15, scala.reflect.api.TypeTags.TypeTag<A2> evidence$16, scala.reflect.api.TypeTags.TypeTag<A3> evidence$17, scala.reflect.api.TypeTags.TypeTag<A4> evidence$18) Defines a Scala closure of 4 arguments as user-defined function (UDF).static <RT,A1, A2, A3, A4, A5> 
 UserDefinedFunctionudf(scala.Function5<A1, A2, A3, A4, A5, RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$19, scala.reflect.api.TypeTags.TypeTag<A1> evidence$20, scala.reflect.api.TypeTags.TypeTag<A2> evidence$21, scala.reflect.api.TypeTags.TypeTag<A3> evidence$22, scala.reflect.api.TypeTags.TypeTag<A4> evidence$23, scala.reflect.api.TypeTags.TypeTag<A5> evidence$24) Defines a Scala closure of 5 arguments as user-defined function (UDF).static <RT,A1, A2, A3, A4, A5, A6> 
 UserDefinedFunctionudf(scala.Function6<A1, A2, A3, A4, A5, A6, RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$25, scala.reflect.api.TypeTags.TypeTag<A1> evidence$26, scala.reflect.api.TypeTags.TypeTag<A2> evidence$27, scala.reflect.api.TypeTags.TypeTag<A3> evidence$28, scala.reflect.api.TypeTags.TypeTag<A4> evidence$29, scala.reflect.api.TypeTags.TypeTag<A5> evidence$30, scala.reflect.api.TypeTags.TypeTag<A6> evidence$31) Defines a Scala closure of 6 arguments as user-defined function (UDF).static <RT,A1, A2, A3, A4, A5, A6, A7> 
 UserDefinedFunctionudf(scala.Function7<A1, A2, A3, A4, A5, A6, A7, RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$32, scala.reflect.api.TypeTags.TypeTag<A1> evidence$33, scala.reflect.api.TypeTags.TypeTag<A2> evidence$34, scala.reflect.api.TypeTags.TypeTag<A3> evidence$35, scala.reflect.api.TypeTags.TypeTag<A4> evidence$36, scala.reflect.api.TypeTags.TypeTag<A5> evidence$37, scala.reflect.api.TypeTags.TypeTag<A6> evidence$38, scala.reflect.api.TypeTags.TypeTag<A7> evidence$39) Defines a Scala closure of 7 arguments as user-defined function (UDF).static <RT,A1, A2, A3, A4, A5, A6, A7, A8> 
 UserDefinedFunctionudf(scala.Function8<A1, A2, A3, A4, A5, A6, A7, A8, RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$40, scala.reflect.api.TypeTags.TypeTag<A1> evidence$41, scala.reflect.api.TypeTags.TypeTag<A2> evidence$42, scala.reflect.api.TypeTags.TypeTag<A3> evidence$43, scala.reflect.api.TypeTags.TypeTag<A4> evidence$44, scala.reflect.api.TypeTags.TypeTag<A5> evidence$45, scala.reflect.api.TypeTags.TypeTag<A6> evidence$46, scala.reflect.api.TypeTags.TypeTag<A7> evidence$47, scala.reflect.api.TypeTags.TypeTag<A8> evidence$48) Defines a Scala closure of 8 arguments as user-defined function (UDF).static <RT,A1, A2, A3, A4, A5, A6, A7, A8, A9> 
 UserDefinedFunctionudf(scala.Function9<A1, A2, A3, A4, A5, A6, A7, A8, A9, RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$49, scala.reflect.api.TypeTags.TypeTag<A1> evidence$50, scala.reflect.api.TypeTags.TypeTag<A2> evidence$51, scala.reflect.api.TypeTags.TypeTag<A3> evidence$52, scala.reflect.api.TypeTags.TypeTag<A4> evidence$53, scala.reflect.api.TypeTags.TypeTag<A5> evidence$54, scala.reflect.api.TypeTags.TypeTag<A6> evidence$55, scala.reflect.api.TypeTags.TypeTag<A7> evidence$56, scala.reflect.api.TypeTags.TypeTag<A8> evidence$57, scala.reflect.api.TypeTags.TypeTag<A9> evidence$58) Defines a Scala closure of 9 arguments as user-defined function (UDF).static ColumnDecodes a BASE64 encoded string column and returns it as a binary column.static ColumnInverse of hex.static ColumnReturns a random value with independent and identically distributed (i.i.d.) values with the specified range of numbers.static ColumnReturns a random value with independent and identically distributed (i.i.d.) values with the specified range of numbers, with the chosen random seed.static ColumnReturns the number of days since 1970-01-01.static ColumnReturns the number of microseconds since 1970-01-01 00:00:00 UTC.static ColumnReturns the number of milliseconds since 1970-01-01 00:00:00 UTC.static ColumnReturns the number of seconds since 1970-01-01 00:00:00 UTC.static ColumnReturns the current Unix timestamp (in seconds) as a long.static ColumnConverts time string in format yyyy-MM-dd HH:mm:ss to Unix timestamp (in seconds), using the default timezone and the default locale.static Columnunix_timestamp(Column s, String p) Converts time string with given pattern to Unix timestamp (in seconds).static Columnunwrap_udt(Column column) Unwrap UDT data type column into its underlying type.static ColumnConverts a string column to upper case.static Columnurl_decode(Column str) Decodes astrin 'application/x-www-form-urlencoded' format using a specific encoding scheme.static Columnurl_encode(Column str) Translates a string into 'application/x-www-form-urlencoded' format using a specific encoding scheme.static Columnuser()Returns the user name of current execution context.static Columnuuid()Returns an universally unique identifier (UUID) string.static ColumnReturns an universally unique identifier (UUID) string.static Columnvalidate_utf8(Column str) Returns the input value if it corresponds to a valid UTF-8 string, or emits a SparkIllegalArgumentException exception otherwise.static ColumnAggregate function: returns the population variance of the values in a group.static ColumnAggregate function: returns the population variance of the values in a group.static ColumnAggregate function: returns the unbiased variance of the values in a group.static ColumnAggregate function: returns the unbiased variance of the values in a group.static ColumnAggregate function: alias forvar_samp.static ColumnAggregate function: alias forvar_samp.static Columnvariant_get(Column v, String path, String targetType) Extracts a sub-variant fromvaccording topathstring, and then cast the sub-variant totargetType.static Columnvariant_get(Column v, Column path, String targetType) Extracts a sub-variant fromvaccording topathcolumn, and then cast the sub-variant totargetType.static Columnversion()Returns the Spark version.static ColumnReturns the day of the week for date/timestamp (0 = Monday, 1 = Tuesday, ..., 6 = Sunday).static Columnweekofyear(Column e) Extracts the week number as an integer from a given date/timestamp/string.static ColumnEvaluates 
a list of conditions and returns one of multiple possible result expressions.static Columnwidth_bucket(Column v, Column min, Column max, Column numBucket) Returns the bucket number into which the value of this expression would fall after being evaluated.static ColumnGenerates tumbling time windows given a timestamp specifying column.static ColumnBucketize rows into one or more time windows given a timestamp specifying column.static ColumnBucketize rows into one or more time windows given a timestamp specifying column.static Columnwindow_time(Column windowColumn) Extracts the event time from the window column.static ColumnReturns a string array of values within the nodes of xml that match the XPath expression.static Columnxpath_boolean(Column xml, Column path) Returns true if the XPath expression evaluates to true, or if a matching node is found.static Columnxpath_double(Column xml, Column path) Returns a double value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.static Columnxpath_float(Column xml, Column path) Returns a float value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.static ColumnReturns an integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.static Columnxpath_long(Column xml, Column path) Returns a long integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.static Columnxpath_number(Column xml, Column path) Returns a double value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.static Columnxpath_short(Column xml, Column path) Returns a short integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.static Columnxpath_string(Column xml, Column path) Returns the text contents of the first xml node that matches the XPath expression.static ColumnCalculates the hash code of given columns using the 64-bit variant of the xxHash algorithm, and returns the result as a long column.static ColumnCalculates the hash code of given columns using the 64-bit variant of the xxHash algorithm, and returns the result as a long column.static ColumnExtracts the year as an integer from a given date/timestamp/string.static Column(Java-specific) A transform for timestamps and dates to partition data into years.static Columnzeroifnull(Column col) Returns zero ifcolis null, orcolotherwise.static ColumnMerge two given arrays, element-wise, into a single array using a function.
- 
Constructor Details- 
functions
public functions()
 
- 
- 
Method Details- 
countDistinctAggregate function: returns the number of distinct items in a group.An alias of count_distinct, and it is encouraged to usecount_distinctdirectly.- Parameters:
- expr- (undocumented)
- exprs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
countDistinctAggregate function: returns the number of distinct items in a group.An alias of count_distinct, and it is encouraged to usecount_distinctdirectly.- Parameters:
- columnName- (undocumented)
- columnNames- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
count_distinctAggregate function: returns the number of distinct items in a group.- Parameters:
- expr- (undocumented)
- exprs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.2.0
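For illustration only, a minimal sketch of counting distinct values over one and over several columns (the sample data and column names are made up; assumes a SparkSession value named spark):
import org.apache.spark.sql.functions._
import spark.implicits._

val df = Seq(("eng", "alice"), ("eng", "bob"), ("ops", "alice")).toDF("dept", "name")
df.agg(
  count_distinct($"dept").as("depts"),
  count_distinct($"dept", $"name").as("dept_name_pairs"))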
 
- 
grouping_idAggregate function: returns the level of grouping, equals to (grouping(c1) << (n-1)) + (grouping(c2) << (n-2)) + ... + grouping(cn)- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.0.0
- Note:
- The list of columns should match with grouping columns exactly, or empty (means all the grouping columns).
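grouping_id is typically used together with cube or rollup; a sketch under the same assumptions as the earlier example (hypothetical data, SparkSession named spark):
import org.apache.spark.sql.functions._
import spark.implicits._

val sales = Seq(("eng", "us", 10), ("eng", "eu", 5), ("ops", "us", 3)).toDF("dept", "region", "amount")
sales.cube($"dept", $"region")
  .agg(grouping_id().as("gid"), sum($"amount").as("total"))
// gid is 0 for fully specified groups and 3 when both dept and region are rolled up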
 
- 
grouping_idAggregate function: returns the level of grouping, equals to (grouping(c1) << (n-1)) + (grouping(c2) << (n-2)) + ... + grouping(cn)- Parameters:
- colName- (undocumented)
- colNames- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.0.0
- Note:
- The list of columns should match with grouping columns exactly.
 
- 
arrayCreates a new array column. The input columns must all have the same data type.- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
arrayCreates a new array column. The input columns must all have the same data type.- Parameters:
- colName- (undocumented)
- colNames- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
mapCreates a new map column. The input columns must be grouped as key-value pairs, e.g. (key1, value1, key2, value2, ...). The key columns must all have the same data type, and can't be null. The value columns must all have the same data type.- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.0.0
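A minimal sketch of the array and map constructors described above (hypothetical data; assumes a SparkSession named spark):
import org.apache.spark.sql.functions._
import spark.implicits._

val df = Seq((1, 2, 3)).toDF("q1", "q2", "q3")
df.select(
  array($"q1", $"q2", $"q3").as("quarters"),            // array<int>
  map(lit("lo"), $"q1", lit("hi"), $"q3").as("bounds"))  // map<string,int>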
 
- 
named_structCreates a struct with the given field names and values.- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
coalesceReturns the first column that is not null, or null if all inputs are null. For example, coalesce(a, b, c) will return a if a is not null, or b if a is null and b is not null, or c if both a and b are null but c is not null.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
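For illustration, a small sketch of falling back across nullable columns (hypothetical data; assumes a SparkSession named spark):
import org.apache.spark.sql.functions._
import spark.implicits._

val users = Seq((Option("Ada"), Option.empty[String]), (Option.empty[String], Option("Turing"))).toDF("nickname", "surname")
users.select(coalesce($"nickname", $"surname", lit("unknown")).as("display_name"))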
 
- 
structCreates a new struct column. If the input column is a column in a DataFrame, or a derived column expression that is named (i.e. aliased), its name would be retained as the StructField's name, otherwise, the newly generated StructField's name would be auto generated as col with a suffix index + 1, i.e. col1, col2, col3, ...- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
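A sketch of the field-naming behaviour described above (hypothetical data; assumes a SparkSession named spark):
import org.apache.spark.sql.functions._
import spark.implicits._

val people = Seq(("Ada", 36)).toDF("name", "age")
people.select(struct($"name", ($"age" + 1).as("next_age")).as("s"))
// s.name keeps the column name and the aliased expression keeps "next_age";
// an unaliased expression such as $"age" + 1 would instead be auto-named col2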
 
- 
structCreates a new struct column that composes multiple input columns.- Parameters:
- colName- (undocumented)
- colNames- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
greatestReturns the greatest value of the list of values, skipping null values. This function takes at least 2 parameters. It will return null iff all parameters are null.- Parameters:
- exprs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
greatestReturns the greatest value of the list of column names, skipping null values. This function takes at least 2 parameters. It will return null iff all parameters are null.- Parameters:
- columnName- (undocumented)
- columnNames- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
leastReturns the least value of the list of values, skipping null values. This function takes at least 2 parameters. It will return null iff all parameters are null.- Parameters:
- exprs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
leastReturns the least value of the list of column names, skipping null values. This function takes at least 2 parameters. It will return null iff all parameters are null.- Parameters:
- columnName- (undocumented)
- columnNames- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
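A minimal sketch of greatest and least over several columns (hypothetical data; assumes a SparkSession named spark):
import org.apache.spark.sql.functions._
import spark.implicits._

val scores = Seq((3, 7, 5)).toDF("q1", "q2", "q3")
scores.select(
  greatest($"q1", $"q2", $"q3").as("best"),
  least($"q1", $"q2", $"q3").as("worst"))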
 
- 
hashCalculates the hash code of given columns, and returns the result as an int column.- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.0.0
 
- 
xxhash64Calculates the hash code of given columns using the 64-bit variant of the xxHash algorithm, and returns the result as a long column. The hash computation uses an initial seed of 42.- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.0.0
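For illustration, a sketch comparing the 32-bit hash and the 64-bit xxhash64 variants (hypothetical data; assumes a SparkSession named spark):
import org.apache.spark.sql.functions._
import spark.implicits._

val events = Seq(("user-1", "2024-01-01")).toDF("user_id", "day")
events.select(
  hash($"user_id", $"day").as("bucket32"),      // int result
  xxhash64($"user_id", $"day").as("bucket64"))  // long result, seed 42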
 
- 
reflectCalls a method with reflection.- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
java_methodCalls a method with reflection.- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
try_reflectThis is a special version of reflect that performs the same operation, but returns a NULL value instead of raising an error if the invoked method throws an exception.- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
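A sketch contrasting reflect and try_reflect on a static Java method (hypothetical data; assumes a SparkSession named spark):
import org.apache.spark.sql.functions._
import spark.implicits._

val ids = Seq("123e4567-e89b-12d3-a456-426614174000", "not-a-uuid").toDF("raw")
ids.select(try_reflect(lit("java.util.UUID"), lit("fromString"), $"raw").as("parsed"))
// reflect(...) with the same arguments would fail on the malformed row;
// try_reflect returns NULL for it instead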
 
- 
stackSeparates col1, ..., colk into n rows. Uses column names col0, col1, etc. by default unless specified otherwise.- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
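A minimal sketch of unpivoting four columns into two rows with stack (hypothetical data; assumes a SparkSession named spark):
import org.apache.spark.sql.functions._
import spark.implicits._

val wide = Seq((1, 2, 3, 4)).toDF("q1", "q2", "q3", "q4")
wide.select(stack(lit(2), $"q1", $"q2", $"q3", $"q4"))
// yields 2 rows with the default column names col0 and col1: (1, 2) and (3, 4)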
 
- 
concat_wsConcatenates multiple input string columns together into a single string column, using the given separator.- Parameters:
- sep- (undocumented)
- exprs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
- Note:
- Input strings which are null are skipped.
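A sketch showing that null inputs are skipped by concat_ws (hypothetical data; assumes a SparkSession named spark):
import org.apache.spark.sql.functions._
import spark.implicits._

val addr = Seq(("Berlin", Option.empty[String], "DE")).toDF("city", "state", "country")
addr.select(concat_ws(", ", $"city", $"state", $"country").as("address"))
// the null state is skipped, so the result is "Berlin, DE"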
 
- 
format_stringFormats the arguments in printf-style and returns the result as a string column.- Parameters:
- format- (undocumented)
- arguments- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
printfFormats the arguments in printf-style and returns the result as a string column.- Parameters:
- format- (undocumented)
- arguments- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
eltReturns the n-th input, e.g., returns input2 when n is 2. The function returns NULL if the index exceeds the length of the array and spark.sql.ansi.enabled is set to false. If spark.sql.ansi.enabled is set to true, it throws ArrayIndexOutOfBoundsException for invalid indices.- Parameters:
- inputs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
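A small sketch of selecting one of several inputs by index with elt (hypothetical data; assumes a SparkSession named spark):
import org.apache.spark.sql.functions._
import spark.implicits._

val opts = Seq((2, "red", "green", "blue")).toDF("choice", "a", "b", "c")
opts.select(elt($"choice", $"a", $"b", $"c").as("picked"))  // "green"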
 
- 
concatConcatenates multiple input columns together into a single column. The function works with strings, binary and compatible array columns.- Parameters:
- exprs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
- Note:
- Returns null if any of the input columns are null.
 
- 
json_tupleCreates a new row for a json column according to the given field names.- Parameters:
- json- (undocumented)
- fields- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
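A minimal sketch of extracting several fields from a JSON string with json_tuple (hypothetical data; assumes a SparkSession named spark):
import org.apache.spark.sql.functions._
import spark.implicits._

val raw = Seq("""{"id": 7, "name": "Ada"}""").toDF("payload")
raw.select(json_tuple($"payload", "id", "name"))
// produces two string columns (c0, c1) holding "7" and "Ada"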
 
- 
arrays_zipReturns a merged array of structs in which the N-th struct contains all N-th values of input arrays.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
map_concatReturns the union of all the given maps.- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
callUDFCall a user-defined function.- Parameters:
- udfName- (undocumented)
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
call_udfCall a user-defined function. Example:
import org.apache.spark.sql._
val df = Seq(("id1", 1), ("id2", 4), ("id3", 5)).toDF("id", "value")
val spark = df.sparkSession
spark.udf.register("simpleUDF", (v: Int) => v * v)
df.select($"id", call_udf("simpleUDF", $"value"))
- Parameters:
- udfName- (undocumented)
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.2.0
 
- 
call_functionCall a SQL function.- Parameters:
- funcName- function name that follows the SQL identifier syntax (can be quoted, can be qualified)
- cols- the expression parameters of function
- Returns:
- (undocumented)
- Since:
- 3.5.0
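Because call_function resolves the function by name, it can also invoke built-in SQL functions; a sketch under the usual assumptions (hypothetical data; SparkSession named spark):
import org.apache.spark.sql.functions._
import spark.implicits._

val df = Seq(("Ada", 36)).toDF("name", "age")
df.select(
  call_function("lower", $"name"),
  call_function("format_number", $"age", lit(2)))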
 
- 
colReturns aColumnbased on the given column name.- Parameters:
- colName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
columnReturns aColumnbased on the given column name. Alias ofcol(java.lang.String).- Parameters:
- colName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
litCreates aColumnof literal value.The passed in object is returned directly if it is already a Column. If the object is a Scala Symbol, it is converted into aColumnalso. Otherwise, a newColumnis created to represent the literal value.- Parameters:
- literal- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
typedLitCreates aColumnof literal value.An alias of typedlit, and it is encouraged to usetypedlitdirectly.- Parameters:
- literal- (undocumented)
- evidence$1- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.2.0
 
- 
typedlitCreates aColumnof literal value.The passed in object is returned directly if it is already a Column. If the object is a Scala Symbol, it is converted into aColumnalso. Otherwise, a newColumnis created to represent the literal value. The difference between this function andlit(java.lang.Object)is that this function can handle parameterized scala types e.g.: List, Seq and Map.- Parameters:
- literal- (undocumented)
- evidence$2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.2.0
- Note:
- typedlit will call expensive Scala reflection APIs.
- lit is preferred if parameterized Scala types are not used.
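A sketch of the parameterized-type literals that typedlit handles (assumes a SparkSession named spark; data is made up):
import org.apache.spark.sql.functions._
import spark.implicits._

val df = Seq(1).toDF("id")
df.select(
  typedlit(Seq(1, 2, 3)).as("xs"),                  // array<int> literal
  typedlit(Map("a" -> 1, "b" -> 2)).as("weights"))  // map<string,int> literal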
 
- 
ascReturns a sort expression based on ascending order of the column.df.sort(asc("dept"), desc("age"))- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
asc_nulls_firstReturns a sort expression based on ascending order of the column, and null values return before non-null values.df.sort(asc_nulls_first("dept"), desc("age"))- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
asc_nulls_lastReturns a sort expression based on ascending order of the column, and null values appear after non-null values.df.sort(asc_nulls_last("dept"), desc("age"))- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
descReturns a sort expression based on the descending order of the column.df.sort(asc("dept"), desc("age"))- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
desc_nulls_firstReturns a sort expression based on the descending order of the column, and null values appear before non-null values.df.sort(asc("dept"), desc_nulls_first("age"))- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
desc_nulls_lastReturns a sort expression based on the descending order of the column, and null values appear after non-null values.df.sort(asc("dept"), desc_nulls_last("age"))- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
approxCountDistinctDeprecated.Use approx_count_distinct. Since 2.1.0.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
approxCountDistinctDeprecated.Use approx_count_distinct. Since 2.1.0.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
approxCountDistinctDeprecated.Use approx_count_distinct. Since 2.1.0.- Parameters:
- e- (undocumented)
- rsd- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
approxCountDistinctDeprecated.Use approx_count_distinct. Since 2.1.0.- Parameters:
- columnName- (undocumented)
- rsd- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
approx_count_distinctAggregate function: returns the approximate number of distinct items in a group.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
approx_count_distinctAggregate function: returns the approximate number of distinct items in a group.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
approx_count_distinctAggregate function: returns the approximate number of distinct items in a group.- Parameters:
- rsd- maximum relative standard deviation allowed (default = 0.05)
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
approx_count_distinctAggregate function: returns the approximate number of distinct items in a group.- Parameters:
- rsd- maximum relative standard deviation allowed (default = 0.05)
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.1.0
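A sketch contrasting the default relative standard deviation with a tighter one (hypothetical data; assumes a SparkSession named spark):
import org.apache.spark.sql.functions._
import spark.implicits._

val visits = Seq("u1", "u2", "u1", "u3").toDF("user_id")
visits.agg(
  approx_count_distinct($"user_id").as("approx_default"),     // rsd = 0.05
  approx_count_distinct($"user_id", 0.01).as("approx_tight"))  // tighter error, more memory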
 
- 
avgAggregate function: returns the average of the values in a group.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
avgAggregate function: returns the average of the values in a group.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
collect_listAggregate function: returns a list of objects with duplicates.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
- Note:
- The function is non-deterministic because the order of collected results depends on the order of the rows which may be non-deterministic after a shuffle.
 
- 
collect_listAggregate function: returns a list of objects with duplicates.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
- Note:
- The function is non-deterministic because the order of collected results depends on the order of the rows which may be non-deterministic after a shuffle.
 
- 
collect_setAggregate function: returns a set of objects with duplicate elements eliminated.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
- Note:
- The function is non-deterministic because the order of collected results depends on the order of the rows which may be non-deterministic after a shuffle.
 
- 
collect_setAggregate function: returns a set of objects with duplicate elements eliminated.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
- Note:
- The function is non-deterministic because the order of collected results depends on the order of the rows which may be non-deterministic after a shuffle.
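A small sketch contrasting collect_list and collect_set in a grouped aggregation (hypothetical data; assumes a SparkSession named spark):
import org.apache.spark.sql.functions._
import spark.implicits._

val df = Seq(("eng", "alice"), ("eng", "alice"), ("eng", "bob")).toDF("dept", "name")
df.groupBy($"dept").agg(
  collect_list($"name").as("all_names"),      // keeps duplicates
  collect_set($"name").as("distinct_names"))  // duplicates removed; order not guaranteed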
 
- 
count_min_sketchReturns a count-min sketch of a column with the given eps, confidence and seed. The result is an array of bytes, which can be deserialized to a CountMinSketch before usage. Count-min sketch is a probabilistic data structure used for cardinality estimation using sub-linear space.- Parameters:
- e- (undocumented)
- eps- (undocumented)
- confidence- (undocumented)
- seed- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
count_min_sketchReturns a count-min sketch of a column with the given eps and confidence. The result is an array of bytes, which can be deserialized to a CountMinSketch before usage. Count-min sketch is a probabilistic data structure used for cardinality estimation using sub-linear space.- Parameters:
- e- (undocumented)
- eps- (undocumented)
- confidence- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
corrAggregate function: returns the Pearson Correlation Coefficient for two columns.- Parameters:
- column1- (undocumented)
- column2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
corrAggregate function: returns the Pearson Correlation Coefficient for two columns.- Parameters:
- columnName1- (undocumented)
- columnName2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
countAggregate function: returns the number of items in a group.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
countAggregate function: returns the number of items in a group.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
countDistinctAggregate function: returns the number of distinct items in a group.An alias of count_distinct, and it is encouraged to usecount_distinctdirectly.- Parameters:
- expr- (undocumented)
- exprs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
countDistinct
public static Column countDistinct(String columnName, scala.collection.immutable.Seq<String> columnNames)
Aggregate function: returns the number of distinct items in a group. An alias of count_distinct, and it is encouraged to use count_distinct directly.- Parameters:
- columnName- (undocumented)
- columnNames- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
count_distinctAggregate function: returns the number of distinct items in a group.- Parameters:
- expr- (undocumented)
- exprs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.2.0
 
- 
covar_popAggregate function: returns the population covariance for two columns.- Parameters:
- column1- (undocumented)
- column2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.0.0
 
- 
covar_popAggregate function: returns the population covariance for two columns.- Parameters:
- columnName1- (undocumented)
- columnName2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.0.0
 
- 
covar_sampAggregate function: returns the sample covariance for two columns.- Parameters:
- column1- (undocumented)
- column2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.0.0
 
- 
covar_sampAggregate function: returns the sample covariance for two columns.- Parameters:
- columnName1- (undocumented)
- columnName2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.0.0
 
- 
firstAggregate function: returns the first value in a group.The function by default returns the first values it sees. It will return the first non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned. - Parameters:
- e- (undocumented)
- ignoreNulls- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.0.0
- Note:
- The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
 
- 
firstAggregate function: returns the first value of a column in a group.The function by default returns the first values it sees. It will return the first non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned. - Parameters:
- columnName- (undocumented)
- ignoreNulls- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.0.0
- Note:
- The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
 
- 
firstAggregate function: returns the first value in a group.The function by default returns the first values it sees. It will return the first non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned. - Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
- Note:
- The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
 
- 
firstAggregate function: returns the first value of a column in a group.The function by default returns the first values it sees. It will return the first non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned. - Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
- Note:
- The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
 
- 
first_valueAggregate function: returns the first value in a group.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
- Note:
- The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
 
- 
first_valueAggregate function: returns the first value in a group.The function by default returns the first values it sees. It will return the first non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned. - Parameters:
- e- (undocumented)
- ignoreNulls- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
- Note:
- The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
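A sketch of picking the first non-null value per group with first (hypothetical data; assumes a SparkSession named spark):
import org.apache.spark.sql.functions._
import spark.implicits._

val df = Seq(("u1", Option.empty[String]), ("u1", Option("a@x.io"))).toDF("user", "email")
df.groupBy($"user").agg(
  first($"email", ignoreNulls = true).as("first_known_email"))
// which row counts as "first" is not guaranteed after a shuffle, per the note above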
 
- 
groupingAggregate function: indicates whether a specified column in a GROUP BY list is aggregated or not, returns 1 for aggregated or 0 for not aggregated in the result set.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.0.0
 
- 
groupingAggregate function: indicates whether a specified column in a GROUP BY list is aggregated or not, returns 1 for aggregated or 0 for not aggregated in the result set.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.0.0
 
- 
grouping_idAggregate function: returns the level of grouping, equal to (grouping(c1) << (n-1)) + (grouping(c2) << (n-2)) + ... + grouping(cn).- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.0.0
- Note:
- The list of columns should exactly match the grouping columns, or be empty (meaning all the grouping columns).
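For example, a minimal sketch assuming a DataFrame df with columns dept, city and salary and import org.apache.spark.sql.functions._:
// Scala: grouping_id() over all grouping columns of a cube
df.cube("dept", "city").agg(grouping_id().as("level"), sum("salary").as("total"))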
 
- 
grouping_idAggregate function: returns the level of grouping, equal to (grouping(c1) << (n-1)) + (grouping(c2) << (n-2)) + ... + grouping(cn).- Parameters:
- colName- (undocumented)
- colNames- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.0.0
- Note:
- The list of columns should exactly match the grouping columns.
 
- 
hll_sketch_aggAggregate function: returns the updatable binary representation of the Datasketches HllSketch configured with lgConfigK arg.- Parameters:
- e- (undocumented)
- lgConfigK- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
hll_sketch_aggAggregate function: returns the updatable binary representation of the Datasketches HllSketch configured with lgConfigK arg.- Parameters:
- e- (undocumented)
- lgConfigK- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
hll_sketch_aggAggregate function: returns the updatable binary representation of the Datasketches HllSketch configured with lgConfigK arg.- Parameters:
- columnName- (undocumented)
- lgConfigK- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
hll_sketch_aggAggregate function: returns the updatable binary representation of the Datasketches HllSketch configured with default lgConfigK value.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
hll_sketch_aggAggregate function: returns the updatable binary representation of the Datasketches HllSketch configured with default lgConfigK value.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
hll_union_aggAggregate function: returns the updatable binary representation of the Datasketches HllSketch, generated by merging previously created Datasketches HllSketch instances via a Datasketches Union instance. Throws an exception if sketches have different lgConfigK values and allowDifferentLgConfigK is set to false.- Parameters:
- e- (undocumented)
- allowDifferentLgConfigK- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
hll_union_aggAggregate function: returns the updatable binary representation of the Datasketches HllSketch, generated by merging previously created Datasketches HllSketch instances via a Datasketches Union instance. Throws an exception if sketches have different lgConfigK values and allowDifferentLgConfigK is set to false.- Parameters:
- e- (undocumented)
- allowDifferentLgConfigK- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
hll_union_aggAggregate function: returns the updatable binary representation of the Datasketches HllSketch, generated by merging previously created Datasketches HllSketch instances via a Datasketches Union instance. Throws an exception if sketches have different lgConfigK values and allowDifferentLgConfigK is set to false.- Parameters:
- columnName- (undocumented)
- allowDifferentLgConfigK- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
hll_union_aggAggregate function: returns the updatable binary representation of the Datasketches HllSketch, generated by merging previously created Datasketches HllSketch instances via a Datasketches Union instance. Throws an exception if sketches have different lgConfigK values.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
hll_union_aggAggregate function: returns the updatable binary representation of the Datasketches HllSketch, generated by merging previously created Datasketches HllSketch instances via a Datasketches Union instance. Throws an exception if sketches have different lgConfigK values.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
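For example, a minimal sketch assuming a DataFrame df with columns day and user_id and import org.apache.spark.sql.functions._; hll_sketch_estimate is the companion function that reads an estimate back out of a binary sketch:
// Scala: one sketch per day, then merge the daily sketches and estimate the overall distinct count
val daily = df.groupBy("day").agg(hll_sketch_agg(col("user_id")).as("sketch"))
daily.agg(hll_sketch_estimate(hll_union_agg(col("sketch"))).as("distinctUsers"))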
 
- 
kurtosisAggregate function: returns the kurtosis of the values in a group.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
kurtosisAggregate function: returns the kurtosis of the values in a group.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
lastAggregate function: returns the last value in a group.The function by default returns the last values it sees. It will return the last non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned. - Parameters:
- e- (undocumented)
- ignoreNulls- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.0.0
- Note:
- The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
 
- 
lastAggregate function: returns the last value of the column in a group.The function by default returns the last values it sees. It will return the last non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned. - Parameters:
- columnName- (undocumented)
- ignoreNulls- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.0.0
- Note:
- The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
 
- 
lastAggregate function: returns the last value in a group.The function by default returns the last values it sees. It will return the last non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned. - Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
- Note:
- The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
 
- 
lastAggregate function: returns the last value of the column in a group.The function by default returns the last values it sees. It will return the last non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned. - Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
- Note:
- The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
 
- 
last_valueAggregate function: returns the last value in a group.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
- Note:
- The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
 
- 
last_valueAggregate function: returns the last value in a group.The function by default returns the last values it sees. It will return the last non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned. - Parameters:
- e- (undocumented)
- ignoreNulls- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
- Note:
- The function is non-deterministic because its result depends on the order of the rows, which may be non-deterministic after a shuffle.
 
- 
make_timeCreate time from hour, minute and second fields. For invalid inputs it will throw an error.- Parameters:
- hour- the hour to represent, from 0 to 23
- minute- the minute to represent, from 0 to 59
- second- the second to represent, from 0 to 59.999999
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
modeAggregate function: returns the most frequent value in a group.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.4.0
 
- 
modeAggregate function: returns the most frequent value in a group.When multiple values have the same greatest frequency, any of them may be returned if deterministic is false or not defined; the lowest value is returned if deterministic is true. - Parameters:
- e- (undocumented)
- deterministic- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
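For example, a minimal sketch assuming a DataFrame df with columns dept and grade and import org.apache.spark.sql.functions._:
// Scala: most frequent grade per department; frequency ties resolve to the lowest value
df.groupBy("dept").agg(mode(col("grade"), deterministic = true).as("modalGrade"))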
 
- 
maxAggregate function: returns the maximum value of the expression in a group.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
maxAggregate function: returns the maximum value of the column in a group.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
max_byAggregate function: returns the value associated with the maximum value of ord.- Parameters:
- e- (undocumented)
- ord- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.3.0
- Note:
- The function is non-deterministic, so the output can be different for rows associated with the same values of e.
 
- 
meanAggregate function: returns the average of the values in a group. Alias for avg.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
meanAggregate function: returns the average of the values in a group. Alias for avg.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
medianAggregate function: returns the median of the values in a group.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.4.0
 
- 
minAggregate function: returns the minimum value of the expression in a group.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
minAggregate function: returns the minimum value of the column in a group.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
min_byAggregate function: returns the value associated with the minimum value of ord.- Parameters:
- e- (undocumented)
- ord- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.3.0
- Note:
- The function is non-deterministic, so the output can be different for rows associated with the same values of e.
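For example, a minimal sketch assuming a DataFrame df with columns name and salary and import org.apache.spark.sql.functions._:
// Scala: the names associated with the highest and lowest salary
df.agg(max_by(col("name"), col("salary")).as("topEarner"),
       min_by(col("name"), col("salary")).as("bottomEarner"))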
 
- 
percentileAggregate function: returns the exact percentile(s) of the numeric column expr at the given percentage(s), with values in the range [0.0, 1.0].- Parameters:
- e- (undocumented)
- percentage- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
percentileAggregate function: returns the exact percentile(s) of the numeric column expr at the given percentage(s), with values in the range [0.0, 1.0].- Parameters:
- e- (undocumented)
- percentage- (undocumented)
- frequency- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
percentile_approxAggregate function: returns the approximate percentile of the numeric column col, which is the smallest value in the ordered col values (sorted from least to greatest) such that no more than percentage of col values is less than or equal to that value.If percentage is an array, each value must be between 0.0 and 1.0. If it is a single floating point value, it must be between 0.0 and 1.0. The accuracy parameter is a positive numeric literal which controls approximation accuracy at the cost of memory. A higher value of accuracy yields better accuracy; 1.0/accuracy is the relative error of the approximation. - Parameters:
- e- (undocumented)
- percentage- (undocumented)
- accuracy- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.1.0
 
- 
approx_percentileAggregate function: returns the approximate percentile of the numeric column col, which is the smallest value in the ordered col values (sorted from least to greatest) such that no more than percentage of col values is less than or equal to that value.If percentage is an array, each value must be between 0.0 and 1.0. If it is a single floating point value, it must be between 0.0 and 1.0. The accuracy parameter is a positive numeric literal which controls approximation accuracy at the cost of memory. A higher value of accuracy yields better accuracy; 1.0/accuracy is the relative error of the approximation. - Parameters:
- e- (undocumented)
- percentage- (undocumented)
- accuracy- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
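For example, a minimal sketch assuming a DataFrame df with a numeric column latencyMs and import org.apache.spark.sql.functions._:
// Scala: approximate 50th and 95th percentiles with an accuracy parameter of 10000
df.agg(approx_percentile(col("latencyMs"), array(lit(0.5), lit(0.95)), lit(10000)).as("p50_p95"))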
 
- 
productAggregate function: returns the product of all numerical elements in a group.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.2.0
 
- 
skewnessAggregate function: returns the skewness of the values in a group.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
skewnessAggregate function: returns the skewness of the values in a group.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
stdAggregate function: alias for stddev_samp.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
stddevAggregate function: alias for stddev_samp.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
stddevAggregate function: alias for stddev_samp.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
stddev_sampAggregate function: returns the sample standard deviation of the expression in a group.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
stddev_sampAggregate function: returns the sample standard deviation of the expression in a group.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
stddev_popAggregate function: returns the population standard deviation of the expression in a group.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
stddev_popAggregate function: returns the population standard deviation of the expression in a group.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
sumAggregate function: returns the sum of all values in the expression.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
sumAggregate function: returns the sum of all values in the given column.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
sumDistinctDeprecated.Use sum_distinct. Since 3.2.0.Aggregate function: returns the sum of distinct values in the expression.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
sumDistinctDeprecated.Use sum_distinct. Since 3.2.0.Aggregate function: returns the sum of distinct values in the expression.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
sum_distinctAggregate function: returns the sum of distinct values in the expression.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.2.0
 
- 
theta_intersection_aggAggregate function: returns the compact binary representation of the Datasketches ThetaSketch, generated by intersecting the Datasketches ThetaSketch instances in the input column via a Datasketches Intersection instance.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
theta_intersection_aggAggregate function: returns the compact binary representation of the Datasketches ThetaSketch, generated by intersecting the Datasketches ThetaSketch instances in the input column via a Datasketches Intersection instance.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
theta_sketch_aggAggregate function: returns the compact binary representation of the Datasketches ThetaSketch built with the values in the input column and configured with lgNomEntries nominal entries.- Parameters:
- e- (undocumented)
- lgNomEntries- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
theta_sketch_aggAggregate function: returns the compact binary representation of the Datasketches ThetaSketch built with the values in the input column and configured with lgNomEntries nominal entries.- Parameters:
- e- (undocumented)
- lgNomEntries- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
theta_sketch_aggAggregate function: returns the compact binary representation of the Datasketches ThetaSketch built with the values in the input column and configured with lgNomEntries nominal entries.- Parameters:
- columnName- (undocumented)
- lgNomEntries- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
theta_sketch_aggAggregate function: returns the compact binary representation of the Datasketches ThetaSketch built with the values in the input column and configured with the default value of 12 forlgNomEntries.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
theta_sketch_aggAggregate function: returns the compact binary representation of the Datasketches ThetaSketch built with the values in the input column and configured with the default value of 12 forlgNomEntries.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
theta_union_aggAggregate function: returns the compact binary representation of the Datasketches ThetaSketch, generated by the union of Datasketches ThetaSketch instances in the input column via a Datasketches Union instance. It allows the configuration of lgNomEntries log nominal entries for the union buffer.- Parameters:
- e- (undocumented)
- lgNomEntries- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
theta_union_aggAggregate function: returns the compact binary representation of the Datasketches ThetaSketch, generated by the union of Datasketches ThetaSketch instances in the input column via a Datasketches Union instance. It allows the configuration of lgNomEntries log nominal entries for the union buffer.- Parameters:
- e- (undocumented)
- lgNomEntries- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
theta_union_aggAggregate function: returns the compact binary representation of the Datasketches ThetaSketch, generated by the union of Datasketches ThetaSketch instances in the input column via a Datasketches Union instance. It allows the configuration of lgNomEntries log nominal entries for the union buffer.- Parameters:
- columnName- (undocumented)
- lgNomEntries- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
theta_union_aggAggregate function: returns the compact binary representation of the Datasketches ThetaSketch, generated by the union of Datasketches ThetaSketch instances in the input column via a Datasketches Union instance. It is configured with the default value of 12 forlgNomEntries.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
theta_union_aggAggregate function: returns the compact binary representation of the Datasketches ThetaSketch, generated by the union of Datasketches ThetaSketch instances in the input column via a Datasketches Union instance. It is configured with the default value of 12 forlgNomEntries.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
listaggAggregate function: returns the concatenation of non-null input values.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
listaggAggregate function: returns the concatenation of non-null input values, separated by the delimiter.- Parameters:
- e- (undocumented)
- delimiter- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
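For example, a minimal sketch assuming a DataFrame df with columns dept and name and import org.apache.spark.sql.functions._:
// Scala: comma-separated list of non-null names per department
df.groupBy("dept").agg(listagg(col("name"), lit(", ")).as("names"))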
 
- 
listagg_distinctAggregate function: returns the concatenation of distinct non-null input values.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
listagg_distinctAggregate function: returns the concatenation of distinct non-null input values, separated by the delimiter.- Parameters:
- e- (undocumented)
- delimiter- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
string_aggAggregate function: returns the concatenation of non-null input values. Alias for listagg.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
string_aggAggregate function: returns the concatenation of non-null input values, separated by the delimiter. Alias for listagg.- Parameters:
- e- (undocumented)
- delimiter- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
string_agg_distinctAggregate function: returns the concatenation of distinct non-null input values. Alias for listagg.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
string_agg_distinctAggregate function: returns the concatenation of distinct non-null input values, separated by the delimiter. Alias for listagg.- Parameters:
- e- (undocumented)
- delimiter- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
varianceAggregate function: alias for var_samp.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
varianceAggregate function: alias for var_samp.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
var_sampAggregate function: returns the unbiased variance of the values in a group.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
var_sampAggregate function: returns the unbiased variance of the values in a group.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
var_popAggregate function: returns the population variance of the values in a group.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
var_popAggregate function: returns the population variance of the values in a group.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
regr_avgxAggregate function: returns the average of the independent variable for non-null pairs in a group, where y is the dependent variable and x is the independent variable.- Parameters:
- y- (undocumented)
- x- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
regr_avgyAggregate function: returns the average of the dependent variable for non-null pairs in a group, where y is the dependent variable and x is the independent variable.- Parameters:
- y- (undocumented)
- x- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
regr_countAggregate function: returns the number of non-null number pairs in a group, where y is the dependent variable and x is the independent variable.- Parameters:
- y- (undocumented)
- x- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
regr_interceptAggregate function: returns the intercept of the univariate linear regression line for non-null pairs in a group, where y is the dependent variable and x is the independent variable.- Parameters:
- y- (undocumented)
- x- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
regr_r2Aggregate function: returns the coefficient of determination for non-null pairs in a group, where y is the dependent variable and x is the independent variable.- Parameters:
- y- (undocumented)
- x- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
regr_slopeAggregate function: returns the slope of the linear regression line for non-null pairs in a group, where y is the dependent variable and x is the independent variable.- Parameters:
- y- (undocumented)
- x- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
regr_sxxAggregate function: returns REGR_COUNT(y, x) * VAR_POP(x) for non-null pairs in a group, where y is the dependent variable and x is the independent variable.- Parameters:
- y- (undocumented)
- x- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
regr_sxyAggregate function: returns REGR_COUNT(y, x) * COVAR_POP(y, x) for non-null pairs in a group, where y is the dependent variable and x is the independent variable.- Parameters:
- y- (undocumented)
- x- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
regr_syyAggregate function: returns REGR_COUNT(y, x) * VAR_POP(y) for non-null pairs in a group, where y is the dependent variable and x is the independent variable.- Parameters:
- y- (undocumented)
- x- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
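For example, a minimal sketch assuming a DataFrame df with numeric columns y and x and import org.apache.spark.sql.functions._:
// Scala: fit y = slope * x + intercept over the non-null pairs and report the fit quality
df.agg(regr_slope(col("y"), col("x")).as("slope"),
       regr_intercept(col("y"), col("x")).as("intercept"),
       regr_r2(col("y"), col("x")).as("r2"))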
 
- 
any_valueAggregate function: returns some value of e for a group of rows.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
any_valueAggregate function: returns some value of e for a group of rows. If ignoreNulls is true, returns only non-null values.- Parameters:
- e- (undocumented)
- ignoreNulls- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
count_ifAggregate function: returns the number of TRUE values for the expression.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
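For example, a minimal sketch assuming a DataFrame df with columns dept and salary and import org.apache.spark.sql.functions._:
// Scala: number of rows per department whose salary exceeds 10000
df.groupBy("dept").agg(count_if(col("salary") > 10000).as("highEarners"))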
 
- 
current_timeReturns the current time at the start of query evaluation. Note that the result will contain 6 fractional digits of seconds.- Returns:
- A time.
- Since:
- 4.1.0
 
- 
current_timeReturns the current time at the start of query evaluation.- Parameters:
- precision- An integer literal in the range [0..6], indicating how many fractional digits of seconds to include in the result.
- Returns:
- A time.
- Since:
- 4.1.0
 
- 
histogram_numericAggregate function: computes a histogram on numeric 'expr' using nb bins. The return value is an array of (x,y) pairs representing the centers of the histogram's bins. As the value of 'nb' is increased, the histogram approximation gets finer-grained, but may yield artifacts around outliers. In practice, 20-40 histogram bins appear to work well, with more bins being required for skewed or smaller datasets. Note that this function creates a histogram with non-uniform bin widths. It offers no guarantees in terms of the mean-squared-error of the histogram, but in practice is comparable to the histograms produced by the R/S-Plus statistical computing packages. Note: the output type of the 'x' field in the return value is propagated from the input value consumed in the aggregate function.- Parameters:
- e- (undocumented)
- nBins- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
everyAggregate function: returns true if all values ofeare true.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
bool_andAggregate function: returns true if all values ofeare true.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
someAggregate function: returns true if at least one value ofeis true.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
anyAggregate function: returns true if at least one value ofeis true.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
bool_orAggregate function: returns true if at least one value ofeis true.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
bit_andAggregate function: returns the bitwise AND of all non-null input values, or null if none.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
bit_orAggregate function: returns the bitwise OR of all non-null input values, or null if none.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
bit_xorAggregate function: returns the bitwise XOR of all non-null input values, or null if none.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
cume_distWindow function: returns the cumulative distribution of values within a window partition, i.e. the fraction of rows that are below the current row. With N = total number of rows in the partition, cumeDist(x) = (number of values before and including x) / N.- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
dense_rankWindow function: returns the rank of rows within a window partition, without any gaps.The difference between rank and dense_rank is that dense_rank leaves no gaps in the ranking sequence when there are ties. That is, if you were ranking a competition using dense_rank and had three people tie for second place, you would say that all three were in second place and that the next person came in third. rank, by contrast, would assign sequential numbers, so the person that came in third place (after the ties) would register as coming in fifth. This is equivalent to the DENSE_RANK function in SQL. - Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
lagWindow function: returns the value that isoffsetrows before the current row, andnullif there is less thanoffsetrows before the current row. For example, anoffsetof one will return the previous row at any given point in the window partition.This is equivalent to the LAG function in SQL. - Parameters:
- e- (undocumented)
- offset- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
lagWindow function: returns the value that isoffsetrows before the current row, andnullif there is less thanoffsetrows before the current row. For example, anoffsetof one will return the previous row at any given point in the window partition.This is equivalent to the LAG function in SQL. - Parameters:
- columnName- (undocumented)
- offset- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
lagWindow function: returns the value that isoffsetrows before the current row, anddefaultValueif there is less thanoffsetrows before the current row. For example, anoffsetof one will return the previous row at any given point in the window partition.This is equivalent to the LAG function in SQL. - Parameters:
- columnName- (undocumented)
- offset- (undocumented)
- defaultValue- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
lagWindow function: returns the value that isoffsetrows before the current row, anddefaultValueif there is less thanoffsetrows before the current row. For example, anoffsetof one will return the previous row at any given point in the window partition.This is equivalent to the LAG function in SQL. - Parameters:
- e- (undocumented)
- offset- (undocumented)
- defaultValue- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
lagWindow function: returns the value that isoffsetrows before the current row, anddefaultValueif there is less thanoffsetrows before the current row.ignoreNullsdetermines whether null values of row are included in or eliminated from the calculation. For example, anoffsetof one will return the previous row at any given point in the window partition.This is equivalent to the LAG function in SQL. - Parameters:
- e- (undocumented)
- offset- (undocumented)
- defaultValue- (undocumented)
- ignoreNulls- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.2.0
 
- 
leadWindow function: returns the value that isoffsetrows after the current row, andnullif there is less thanoffsetrows after the current row. For example, anoffsetof one will return the next row at any given point in the window partition.This is equivalent to the LEAD function in SQL. - Parameters:
- columnName- (undocumented)
- offset- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
leadWindow function: returns the value that isoffsetrows after the current row, andnullif there is less thanoffsetrows after the current row. For example, anoffsetof one will return the next row at any given point in the window partition.This is equivalent to the LEAD function in SQL. - Parameters:
- e- (undocumented)
- offset- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
leadWindow function: returns the value that isoffsetrows after the current row, anddefaultValueif there is less thanoffsetrows after the current row. For example, anoffsetof one will return the next row at any given point in the window partition.This is equivalent to the LEAD function in SQL. - Parameters:
- columnName- (undocumented)
- offset- (undocumented)
- defaultValue- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
leadWindow function: returns the value that isoffsetrows after the current row, anddefaultValueif there is less thanoffsetrows after the current row. For example, anoffsetof one will return the next row at any given point in the window partition.This is equivalent to the LEAD function in SQL. - Parameters:
- e- (undocumented)
- offset- (undocumented)
- defaultValue- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
leadWindow function: returns the value that isoffsetrows after the current row, anddefaultValueif there is less thanoffsetrows after the current row.ignoreNullsdetermines whether null values of row are included in or eliminated from the calculation. The default value ofignoreNullsis false. For example, anoffsetof one will return the next row at any given point in the window partition.This is equivalent to the LEAD function in SQL. - Parameters:
- e- (undocumented)
- offset- (undocumented)
- defaultValue- (undocumented)
- ignoreNulls- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.2.0
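For example, a minimal sketch assuming a DataFrame df with columns name, day and amount and import org.apache.spark.sql.functions._:
// Scala: compare each row's amount with the previous and next row's amount per name, ordered by day
import org.apache.spark.sql.expressions.Window
val w = Window.partitionBy("name").orderBy("day")
df.select(col("*"),
  lag(col("amount"), 1, 0).over(w).as("prevAmount"),
  lead(col("amount"), 1, 0).over(w).as("nextAmount"))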
 
- 
nth_valueWindow function: returns the value that is theoffsetth row of the window frame (counting from 1), andnullif the size of window frame is less thanoffsetrows.It will return the offsetth non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned.This is equivalent to the nth_value function in SQL. - Parameters:
- e- (undocumented)
- offset- (undocumented)
- ignoreNulls- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.1.0
 
- 
nth_valueWindow function: returns the value that is theoffsetth row of the window frame (counting from 1), andnullif the size of window frame is less thanoffsetrows.This is equivalent to the nth_value function in SQL. - Parameters:
- e- (undocumented)
- offset- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.1.0
 
- 
ntileWindow function: returns the ntile group id (from 1 toninclusive) in an ordered window partition. For example, ifnis 4, the first quarter of the rows will get value 1, the second quarter will get 2, the third quarter will get 3, and the last quarter will get 4.This is equivalent to the NTILE function in SQL. - Parameters:
- n- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
percent_rankWindow function: returns the relative rank (i.e. percentile) of rows within a window partition.This is computed by: (rank of row in its partition - 1) / (number of rows in the partition - 1). This is equivalent to the PERCENT_RANK function in SQL. - Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
rankWindow function: returns the rank of rows within a window partition.The difference between rank and dense_rank is that dense_rank leaves no gaps in the ranking sequence when there are ties. That is, if you were ranking a competition using dense_rank and had three people tie for second place, you would say that all three were in second place and that the next person came in third. rank, by contrast, would assign sequential numbers, so the person that came in third place (after the ties) would register as coming in fifth. This is equivalent to the RANK function in SQL. - Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
row_numberWindow function: returns a sequential number starting at 1 within a window partition.- Returns:
- (undocumented)
- Since:
- 1.6.0
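For example, a minimal sketch assuming a DataFrame df with columns dept and salary and import org.apache.spark.sql.functions._:
// Scala: compare the three ranking functions within each department, highest salary first
import org.apache.spark.sql.expressions.Window
val w = Window.partitionBy("dept").orderBy(col("salary").desc)
df.select(col("*"),
  rank().over(w).as("rank"),
  dense_rank().over(w).as("denseRank"),
  row_number().over(w).as("rowNumber"))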
 
- 
arrayCreates a new array column. The input columns must all have the same data type.- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
arrayCreates a new array column. The input columns must all have the same data type.- Parameters:
- colName- (undocumented)
- colNames- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
mapCreates a new map column. The input columns must be grouped as key-value pairs, e.g. (key1, value1, key2, value2, ...). The key columns must all have the same data type, and can't be null. The value columns must all have the same data type.- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.0
 
- 
named_structCreates a struct with the given field names and values.- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
map_from_arraysCreates a new map column. The array in the first column is used for keys. The array in the second column is used for values. All elements in the array for key should not be null.- Parameters:
- keys- (undocumented)
- values- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4
 
- 
str_to_mapCreates a map after splitting the text into key/value pairs using delimiters. BothpairDelimandkeyValueDelimare treated as regular expressions.- Parameters:
- text- (undocumented)
- pairDelim- (undocumented)
- keyValueDelim- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
str_to_mapCreates a map after splitting the text into key/value pairs using delimiters. ThepairDelimis treated as regular expressions.- Parameters:
- text- (undocumented)
- pairDelim- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
str_to_mapCreates a map after splitting the text into key/value pairs using delimiters.- Parameters:
- text- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
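For example, a minimal sketch assuming a DataFrame df with a string column props holding values such as "a:1,b:2", and import org.apache.spark.sql.functions._:
// Scala: split into a map using "," between pairs and ":" between key and value
df.select(str_to_map(col("props"), lit(","), lit(":")).as("propsMap"))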
 
- 
broadcastMarks a DataFrame as small enough for use in broadcast joins.The following example marks the right DataFrame for broadcast hash join using joinKey.
// left and right are DataFrames
left.join(broadcast(right), "joinKey")
- Parameters:
- df- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
coalesceReturns the first column that is not null, or null if all inputs are null.For example, coalesce(a, b, c) will return a if a is not null, or b if a is null and b is not null, or c if both a and b are null but c is not null.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
input_file_nameCreates a string column for the file name of the current Spark task.- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
isnanReturn true iff the column is NaN.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
isnullReturn true iff the column is null.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
monotonicallyIncreasingIdDeprecated.Use monotonically_increasing_id(). Since 2.0.0.A column expression that generates monotonically increasing 64-bit integers.The generated ID is guaranteed to be monotonically increasing and unique, but not consecutive. The current implementation puts the partition ID in the upper 31 bits, and the record number within each partition in the lower 33 bits. The assumption is that the data frame has less than 1 billion partitions, and each partition has less than 8 billion records. As an example, consider a DataFrame with two partitions, each with 3 records. This expression would return the following IDs: 0, 1, 2, 8589934592 (1L << 33), 8589934593, 8589934594.- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
monotonically_increasing_idA column expression that generates monotonically increasing 64-bit integers.The generated ID is guaranteed to be monotonically increasing and unique, but not consecutive. The current implementation puts the partition ID in the upper 31 bits, and the record number within each partition in the lower 33 bits. The assumption is that the data frame has less than 1 billion partitions, and each partition has less than 8 billion records. As an example, consider a DataFrame with two partitions, each with 3 records. This expression would return the following IDs: 0, 1, 2, 8589934592 (1L << 33), 8589934593, 8589934594.- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
nanvlReturns col1 if it is not NaN, or col2 if col1 is NaN.Both inputs should be floating point columns (DoubleType or FloatType). - Parameters:
- col1- (undocumented)
- col2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
negateUnary minus, i.e. negate the expression.
// Select the amount column and negate all values.
// Scala:
df.select( -df("amount") )
// Java:
df.select( negate(df.col("amount")) );
- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
notInversion of boolean expression, i.e. NOT.
// Scala: select rows that are not active (isActive === false)
df.filter( !df("isActive") )
// Java:
df.filter( not(df.col("isActive")) );
- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
randGenerate a random column with independent and identically distributed (i.i.d.) samples uniformly distributed in [0.0, 1.0).- Parameters:
- seed- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
- Note:
- The function is non-deterministic in general case.
 
- 
randGenerate a random column with independent and identically distributed (i.i.d.) samples uniformly distributed in [0.0, 1.0).- Returns:
- (undocumented)
- Since:
- 1.4.0
- Note:
- The function is non-deterministic in general case.
 
- 
randnGenerate a column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.- Parameters:
- seed- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
- Note:
- The function is non-deterministic in general case.
 
- 
randnGenerate a column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.- Returns:
- (undocumented)
- Since:
- 1.4.0
- Note:
- The function is non-deterministic in general case.
 
- 
randstrReturns a string of the specified length whose characters are chosen uniformly at random from the following pool of characters: 0-9, a-z, A-Z. The string length must be a constant two-byte or four-byte integer (SMALLINT or INT, respectively).- Parameters:
- length- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
randstrReturns a string of the specified length whose characters are chosen uniformly at random from the following pool of characters: 0-9, a-z, A-Z, with the chosen random seed. The string length must be a constant two-byte or four-byte integer (SMALLINT or INT, respectively).- Parameters:
- length- (undocumented)
- seed- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
spark_partition_idPartition ID.- Returns:
- (undocumented)
- Since:
- 1.6.0
- Note:
- This is non-deterministic because it depends on data partitioning and task scheduling.
 
- 
sqrtComputes the square root of the specified float value.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
sqrtComputes the square root of the specified float value.- Parameters:
- colName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
try_addReturns the sum of left and right; the result is null on overflow. The acceptable input types are the same as with the + operator.- Parameters:
- left- (undocumented)
- right- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
try_avgReturns the mean calculated from values of a group and the result is null on overflow.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
try_divideReturns dividend/divisor. It always performs floating point division. Its result is always null if divisor is 0.- Parameters:
- left- (undocumented)
- right- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
try_modReturns the remainder of dividend/divisor. Its result is always null if divisor is 0.- Parameters:
- left- (undocumented)
- right- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
try_multiplyReturns left * right; the result is null on overflow. The acceptable input types are the same as with the * operator.- Parameters:
- left- (undocumented)
- right- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
try_subtractReturns left - right; the result is null on overflow. The acceptable input types are the same as with the - operator.- Parameters:
- left- (undocumented)
- right- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
try_sumReturns the sum calculated from values of a group and the result is null on overflow.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
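For example, a minimal sketch assuming a DataFrame df with numeric columns revenue and orders (orders may be 0) and import org.apache.spark.sql.functions._:
// Scala: a null result instead of a failure when orders is 0 or the sum overflows
df.select(try_divide(col("revenue"), col("orders")).as("avgOrderValue"))
df.agg(try_sum(col("revenue")).as("totalRevenue"))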
 
- 
structCreates a new struct column. If the input column is a column in a DataFrame, or a derived column expression that is named (i.e. aliased), its name is retained as the StructField's name; otherwise, the newly generated StructField's name is auto-generated as col with a suffix index + 1, i.e. col1, col2, col3, ...- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
structCreates a new struct column that composes multiple input columns.- Parameters:
- colName- (undocumented)
- colNames- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
whenEvaluates a list of conditions and returns one of multiple possible result expressions. If otherwise is not defined at the end, null is returned for unmatched conditions.
// Example: encoding gender string column into integer.
// Scala:
people.select(when(people("gender") === "male", 0)
  .when(people("gender") === "female", 1)
  .otherwise(2))
// Java:
people.select(when(col("gender").equalTo("male"), 0)
  .when(col("gender").equalTo("female"), 1)
  .otherwise(2))
- Parameters:
- condition- (undocumented)
- value- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
bitwiseNOTDeprecated.Use bitwise_not. Since 3.2.0.Computes bitwise NOT (~) of a number.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
bitwise_notComputes bitwise NOT (~) of a number.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.2.0
 
- 
bit_countReturns the number of bits that are set in the argument expr as an unsigned 64-bit integer, or NULL if the argument is NULL.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
bit_getReturns the value of the bit (0 or 1) at the specified position. The positions are numbered from right to left, starting at zero. The position argument cannot be negative.- Parameters:
- e- (undocumented)
- pos- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
getbitReturns the value of the bit (0 or 1) at the specified position. The positions are numbered from right to left, starting at zero. The position argument cannot be negative.- Parameters:
- e- (undocumented)
- pos- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
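For example, a minimal sketch assuming a DataFrame df with an integer column flags and import org.apache.spark.sql.functions._:
// Scala: count the set bits and read bit 3 (positions count from the right, starting at zero)
df.select(bit_count(col("flags")).as("setBits"), bit_get(col("flags"), lit(3)).as("bit3"))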
 
- 
exprParses the expression string into the column that it represents, similar to Dataset.selectExpr(java.lang.String...).
// get the number of words of each length
df.groupBy(expr("length(word)")).count()
- Parameters:
- expr- (undocumented)
- Returns:
- (undocumented)
 
- 
absComputes the absolute value of a numeric value.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
acos- Parameters:
- e- (undocumented)
- Returns:
- inverse cosine of ein radians, as if computed byjava.lang.Math.acos
- Since:
- 1.4.0
 
- 
acos- Parameters:
- columnName- (undocumented)
- Returns:
- inverse cosine of columnName, as if computed byjava.lang.Math.acos
- Since:
- 1.4.0
 
- 
acosh- Parameters:
- e- (undocumented)
- Returns:
- inverse hyperbolic cosine of e
- Since:
- 3.1.0
 
- 
acosh- Parameters:
- columnName- (undocumented)
- Returns:
- inverse hyperbolic cosine of columnName
- Since:
- 3.1.0
 
- 
asin- Parameters:
- e- (undocumented)
- Returns:
- inverse sine of ein radians, as if computed byjava.lang.Math.asin
- Since:
- 1.4.0
 
- 
asin- Parameters:
- columnName- (undocumented)
- Returns:
- inverse sine of columnName, as if computed byjava.lang.Math.asin
- Since:
- 1.4.0
 
- 
asinh- Parameters:
- e- (undocumented)
- Returns:
- inverse hyperbolic sine of e
- Since:
- 3.1.0
 
- 
asinh- Parameters:
- columnName- (undocumented)
- Returns:
- inverse hyperbolic sine of columnName
- Since:
- 3.1.0
 
- 
atan- Parameters:
- e- (undocumented)
- Returns:
- inverse tangent of eas if computed byjava.lang.Math.atan
- Since:
- 1.4.0
 
- 
atan- Parameters:
- columnName- (undocumented)
- Returns:
- inverse tangent of columnName, as if computed byjava.lang.Math.atan
- Since:
- 1.4.0
 
- 
atan2- Parameters:
- y- coordinate on y-axis
- x- coordinate on x-axis
- Returns:
- the theta component of the point (r, theta) in polar coordinates that
   corresponds to the point (x, y) in Cartesian coordinates, as if computed by
   java.lang.Math.atan2
- Since:
- 1.4.0
 
- 
atan2- Parameters:
- y- coordinate on y-axis
- xName- coordinate on x-axis
- Returns:
- the theta component of the point (r, theta) in polar coordinates that
   corresponds to the point (x, y) in Cartesian coordinates, as if computed by
   java.lang.Math.atan2
- Since:
- 1.4.0
 
- 
atan2- Parameters:
- yName- coordinate on y-axis
- x- coordinate on x-axis
- Returns:
- the theta component of the point (r, theta) in polar coordinates that
   corresponds to the point (x, y) in Cartesian coordinates, as if computed by
   java.lang.Math.atan2
- Since:
- 1.4.0
 
- 
atan2- Parameters:
- yName- coordinate on y-axis
- xName- coordinate on x-axis
- Returns:
- the theta component of the point (r, theta) in polar coordinates that
   corresponds to the point (x, y) in Cartesian coordinates, as if computed by
   java.lang.Math.atan2
- Since:
- 1.4.0
 
- 
atan2- Parameters:
- y- coordinate on y-axis
- xValue- coordinate on x-axis
- Returns:
- the theta component of the point (r, theta) in polar coordinates that
   corresponds to the point (x, y) in Cartesian coordinates, as if computed by
   java.lang.Math.atan2
- Since:
- 1.4.0
 
- 
atan2- Parameters:
- yName- coordinate on y-axis
- xValue- coordinate on x-axis
- Returns:
- the theta component of the point (r, theta) in polar coordinates that
   corresponds to the point (x, y) in Cartesian coordinates, as if computed by
   java.lang.Math.atan2
- Since:
- 1.4.0
 
- 
atan2- Parameters:
- yValue- coordinate on y-axis
- x- coordinate on x-axis
- Returns:
- the theta component of the point (r, theta) in polar coordinates that
   corresponds to the point (x, y) in Cartesian coordinates, as if computed by
   java.lang.Math.atan2
- Since:
- 1.4.0
 
- 
atan2- Parameters:
- yValue- coordinate on y-axis
- xName- coordinate on x-axis
- Returns:
- the theta component of the point (r, theta) in polar coordinates that
   corresponds to the point (x, y) in Cartesian coordinates, as if computed by
   java.lang.Math.atan2
- Since:
- 1.4.0
 
- 
atanh- Parameters:
- e- (undocumented)
- Returns:
- inverse hyperbolic tangent of e
- Since:
- 3.1.0
 
- 
atanh- Parameters:
- columnName- (undocumented)
- Returns:
- inverse hyperbolic tangent of columnName
- Since:
- 3.1.0
 
- 
binAn expression that returns the string representation of the binary value of the given long column. For example, bin("12") returns "1100".- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
binAn expression that returns the string representation of the binary value of the given long column. For example, bin("12") returns "1100".- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
cbrtComputes the cube-root of the given value.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
cbrtComputes the cube-root of the given column.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
ceilComputes the ceiling of the given value of e to scale decimal places.- Parameters:
- e- (undocumented)
- scale- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.3.0
 
- 
ceilComputes the ceiling of the given value ofeto 0 decimal places.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
ceilComputes the ceiling of the given value ofeto 0 decimal places.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
ceilingComputes the ceiling of the given value of e to scale decimal places.- Parameters:
- e- (undocumented)
- scale- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
ceilingComputes the ceiling of the given value ofeto 0 decimal places.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
convConvert a number in a string column from one base to another.- Parameters:
- num- (undocumented)
- fromBase- (undocumented)
- toBase- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
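For example, a minimal sketch assuming a DataFrame df with a string column binStr holding binary digits such as "1100", and import org.apache.spark.sql.functions._:
// Scala: convert the string from base 2 to base 10; "1100" becomes "12"
df.select(conv(col("binStr"), 2, 10).as("decStr"))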
 
- 
cos- Parameters:
- e- angle in radians
- Returns:
- cosine of the angle, as if computed by java.lang.Math.cos
- Since:
- 1.4.0
 
- 
cos- Parameters:
- columnName- angle in radians
- Returns:
- cosine of the angle, as if computed by java.lang.Math.cos
- Since:
- 1.4.0
 
- 
cosh- Parameters:
- e- hyperbolic angle
- Returns:
- hyperbolic cosine of the angle, as if computed by java.lang.Math.cosh
- Since:
- 1.4.0
 
- 
cosh- Parameters:
- columnName- hyperbolic angle
- Returns:
- hyperbolic cosine of the angle, as if computed by java.lang.Math.cosh
- Since:
- 1.4.0
 
- 
cot- Parameters:
- e- angle in radians
- Returns:
- cotangent of the angle
- Since:
- 3.3.0
 
- 
csc- Parameters:
- e- angle in radians
- Returns:
- cosecant of the angle
- Since:
- 3.3.0
 
- 
eReturns Euler's number.- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
expComputes the exponential of the given value.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
expComputes the exponential of the given column.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
expm1Computes the exponential of the given value minus one.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
expm1Computes the exponential of the given column minus one.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
factorialComputes the factorial of the given value.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
floorComputes the floor of the given value of e to scale decimal places.- Parameters:
- e- (undocumented)
- scale- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.3.0
 
- 
floorComputes the floor of the given value ofeto 0 decimal places.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
floorComputes the floor of the given column value to 0 decimal places.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
greatestReturns the greatest value of the list of values, skipping null values. This function takes at least 2 parameters. It will return null iff all parameters are null.- Parameters:
- exprs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
greatestpublic static Column greatest(String columnName, scala.collection.immutable.Seq<String> columnNames)
Returns the greatest value of the list of column names, skipping null values. This function takes at least 2 parameters. It will return null iff all parameters are null.- Parameters:
- columnName- (undocumented)
- columnNames- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
hexComputes hex value of the given column.- Parameters:
- column- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
unhexInverse of hex. Interprets each pair of characters as a hexadecimal number and converts to the byte representation of number.- Parameters:
- column- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
hypotComputessqrt(a^2^ + b^2^)without intermediate overflow or underflow.- Parameters:
- l- (undocumented)
- r- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
hypotComputessqrt(a^2^ + b^2^)without intermediate overflow or underflow.- Parameters:
- l- (undocumented)
- rightName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
hypotComputessqrt(a^2^ + b^2^)without intermediate overflow or underflow.- Parameters:
- leftName- (undocumented)
- r- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
hypotComputessqrt(a^2^ + b^2^)without intermediate overflow or underflow.- Parameters:
- leftName- (undocumented)
- rightName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
hypotComputessqrt(a^2^ + b^2^)without intermediate overflow or underflow.- Parameters:
- l- (undocumented)
- r- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
hypotComputessqrt(a^2^ + b^2^)without intermediate overflow or underflow.- Parameters:
- leftName- (undocumented)
- r- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
hypotComputessqrt(a^2^ + b^2^)without intermediate overflow or underflow.- Parameters:
- l- (undocumented)
- r- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
hypotComputessqrt(a^2^ + b^2^)without intermediate overflow or underflow.- Parameters:
- l- (undocumented)
- rightName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
leastReturns the least value of the list of values, skipping null values. This function takes at least 2 parameters. It will return null iff all parameters are null.- Parameters:
- exprs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
leastReturns the least value of the list of column names, skipping null values. This function takes at least 2 parameters. It will return null iff all parameters are null.- Parameters:
- columnName- (undocumented)
- columnNames- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
lnComputes the natural logarithm of the given value.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
logComputes the natural logarithm of the given value.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
logComputes the natural logarithm of the given column.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
logReturns the first argument-base logarithm of the second argument.- Parameters:
- base- (undocumented)
- a- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
logReturns the first argument-base logarithm of the second argument.- Parameters:
- base- (undocumented)
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
log10Computes the logarithm of the given value in base 10.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
log10Computes the logarithm of the given value in base 10.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
log1pComputes the natural logarithm of the given value plus one.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
log1pComputes the natural logarithm of the given column plus one.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
log2Computes the logarithm of the given column in base 2.- Parameters:
- expr- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
log2Computes the logarithm of the given value in base 2.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
negativeReturns the negated value.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
piReturns Pi.- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
positiveReturns the value.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
powReturns the value of the first argument raised to the power of the second argument.- Parameters:
- l- (undocumented)
- r- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
powReturns the value of the first argument raised to the power of the second argument.- Parameters:
- l- (undocumented)
- rightName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
powReturns the value of the first argument raised to the power of the second argument.- Parameters:
- leftName- (undocumented)
- r- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
powReturns the value of the first argument raised to the power of the second argument.- Parameters:
- leftName- (undocumented)
- rightName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
powReturns the value of the first argument raised to the power of the second argument.- Parameters:
- l- (undocumented)
- r- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
powReturns the value of the first argument raised to the power of the second argument.- Parameters:
- leftName- (undocumented)
- r- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
powReturns the value of the first argument raised to the power of the second argument.- Parameters:
- l- (undocumented)
- r- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
powReturns the value of the first argument raised to the power of the second argument.- Parameters:
- l- (undocumented)
- rightName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
powerReturns the value of the first argument raised to the power of the second argument.- Parameters:
- l- (undocumented)
- r- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
pmodReturns the positive value of dividend mod divisor.- Parameters:
- dividend- (undocumented)
- divisor- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
rintReturns the double value that is closest in value to the argument and is equal to a mathematical integer.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
rintReturns the double value that is closest in value to the argument and is equal to a mathematical integer.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
round
Returns the value of the column e rounded to 0 decimal places with HALF_UP round mode.
- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
round
Round the value of e to scale decimal places with HALF_UP round mode if scale is greater than or equal to 0 or at integral part when scale is less than 0.
- Parameters:
- e- (undocumented)
- scale- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
round
Round the value of e to scale decimal places with HALF_UP round mode if scale is greater than or equal to 0 or at integral part when scale is less than 0.
- Parameters:
- e- (undocumented)
- scale- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
bround
Returns the value of the column e rounded to 0 decimal places with HALF_EVEN round mode.
- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.0.0
 
- 
bround
Round the value of e to scale decimal places with HALF_EVEN round mode if scale is greater than or equal to 0 or at integral part when scale is less than 0.
- Parameters:
- e- (undocumented)
- scale- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.0.0
 
- 
bround
Round the value of e to scale decimal places with HALF_EVEN round mode if scale is greater than or equal to 0 or at integral part when scale is less than 0.
- Parameters:
- e- (undocumented)
- scale- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
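The difference between HALF_UP (round) and HALF_EVEN (bround) is easiest to see on ties. A minimal Scala sketch, assuming a SparkSession in scope as spark with spark.implicits._ imported:

import org.apache.spark.sql.functions._

Seq(2.5, 3.5).toDF("x")
  .select(
    col("x"),
    round(col("x"), 0).as("half_up"),     // 2.5 -> 3.0, 3.5 -> 4.0
    bround(col("x"), 0).as("half_even"))  // 2.5 -> 2.0, 3.5 -> 4.0
  .show()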
 
- 
sec- Parameters:
- e- angle in radians
- Returns:
- secant of the angle
- Since:
- 3.3.0
 
- 
shiftLeft
Deprecated. Use shiftleft. Since 3.2.0.
Shift the given value numBits left. If the given value is a long value, this function will return a long value else it will return an integer value.
- Parameters:
- e- (undocumented)
- numBits- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
shiftleft
Shift the given value numBits left. If the given value is a long value, this function will return a long value else it will return an integer value.
- Parameters:
- e- (undocumented)
- numBits- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.2.0
 
- 
shiftRight
Deprecated. Use shiftright. Since 3.2.0.
(Signed) shift the given value numBits right. If the given value is a long value, it will return a long value else it will return an integer value.
- Parameters:
- e- (undocumented)
- numBits- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
shiftright
(Signed) shift the given value numBits right. If the given value is a long value, it will return a long value else it will return an integer value.
- Parameters:
- e- (undocumented)
- numBits- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.2.0
 
- 
shiftRightUnsigned
Deprecated. Use shiftrightunsigned. Since 3.2.0.
Unsigned shift the given value numBits right. If the given value is a long value, it will return a long value else it will return an integer value.
- Parameters:
- e- (undocumented)
- numBits- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
shiftrightunsigned
Unsigned shift the given value numBits right. If the given value is a long value, it will return a long value else it will return an integer value.
- Parameters:
- e- (undocumented)
- numBits- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.2.0
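A short Scala sketch contrasting the left shift with the signed and unsigned right shifts, assuming a SparkSession in scope as spark with spark.implicits._ imported:

import org.apache.spark.sql.functions._

Seq(-8, 8).toDF("x")
  .select(
    col("x"),
    shiftleft(col("x"), 1).as("lshift"),                    // -16, 16
    shiftright(col("x"), 1).as("signed_rshift"),            // -4, 4
    shiftrightunsigned(col("x"), 1).as("unsigned_rshift"))  // 2147483644, 4
  .show()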
 
- 
signComputes the signum of the given value.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
signumComputes the signum of the given value.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
signumComputes the signum of the given column.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
sin- Parameters:
- e- angle in radians
- Returns:
- sine of the angle, as if computed by java.lang.Math.sin
- Since:
- 1.4.0
 
- 
sin- Parameters:
- columnName- angle in radians
- Returns:
- sine of the angle, as if computed by java.lang.Math.sin
- Since:
- 1.4.0
 
- 
sinh- Parameters:
- e- hyperbolic angle
- Returns:
- hyperbolic sine of the given value, as if computed by java.lang.Math.sinh
- Since:
- 1.4.0
 
- 
sinh- Parameters:
- columnName- hyperbolic angle
- Returns:
- hyperbolic sine of the given value, as if computed by java.lang.Math.sinh
- Since:
- 1.4.0
 
- 
tan- Parameters:
- e- angle in radians
- Returns:
- tangent of the given value, as if computed by java.lang.Math.tan
- Since:
- 1.4.0
 
- 
tan- Parameters:
- columnName- angle in radians
- Returns:
- tangent of the given value, as if computed by java.lang.Math.tan
- Since:
- 1.4.0
 
- 
tanh- Parameters:
- e- hyperbolic angle
- Returns:
- hyperbolic tangent of the given value, as if computed by java.lang.Math.tanh
- Since:
- 1.4.0
 
- 
tanh- Parameters:
- columnName- hyperbolic angle
- Returns:
- hyperbolic tangent of the given value, as if computed by java.lang.Math.tanh
- Since:
- 1.4.0
 
- 
toDegrees
Deprecated. Use degrees. Since 2.1.0.
- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
toDegrees
Deprecated. Use degrees. Since 2.1.0.
- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
degreesConverts an angle measured in radians to an approximately equivalent angle measured in degrees.- Parameters:
- e- angle in radians
- Returns:
- angle in degrees, as if computed by java.lang.Math.toDegrees
- Since:
- 2.1.0
 
- 
degreesConverts an angle measured in radians to an approximately equivalent angle measured in degrees.- Parameters:
- columnName- angle in radians
- Returns:
- angle in degrees, as if computed by java.lang.Math.toDegrees
- Since:
- 2.1.0
 
- 
toRadians
Deprecated. Use radians. Since 2.1.0.
- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
toRadians
Deprecated. Use radians. Since 2.1.0.
- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.4.0
 
- 
radiansConverts an angle measured in degrees to an approximately equivalent angle measured in radians.- Parameters:
- e- angle in degrees
- Returns:
- angle in radians, as if computed by java.lang.Math.toRadians
- Since:
- 2.1.0
 
- 
radiansConverts an angle measured in degrees to an approximately equivalent angle measured in radians.- Parameters:
- columnName- angle in degrees
- Returns:
- angle in radians, as if computed by java.lang.Math.toRadians
- Since:
- 2.1.0
 
- 
width_bucket
Returns the bucket number into which the value of this expression would fall after being evaluated. Note that input arguments must follow conditions listed below; otherwise, the method will return null.
- Parameters:
- v- value to compute a bucket number in the histogram
- min- minimum value of the histogram
- max- maximum value of the histogram
- numBucket- the number of buckets
- Returns:
- the bucket number into which the value would fall after being evaluated
- Since:
- 3.5.0
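A minimal Scala sketch of an equi-width histogram over [0, 20) with 4 buckets, assuming a SparkSession in scope as spark with spark.implicits._ imported; the sample values are illustrative:

import org.apache.spark.sql.functions._

Seq(5.3, 12.0, 25.0).toDF("v")
  .select(col("v"), width_bucket(col("v"), lit(0.0), lit(20.0), lit(4)).as("bucket"))
  .show()
// 5.3 -> bucket 2, 12.0 -> bucket 3, 25.0 -> bucket 5 (values above max land in numBucket + 1)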
 
- 
current_catalogReturns the current catalog.- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
current_databaseReturns the current database.- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
current_schemaReturns the current schema.- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
current_userReturns the user name of current execution context.- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
md5Calculates the MD5 digest of a binary column and returns the value as a 32 character hex string.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
sha1Calculates the SHA-1 digest of a binary column and returns the value as a 40 character hex string.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
sha2Calculates the SHA-2 family of hash functions of a binary column and returns the value as a hex string.- Parameters:
- e- column to compute SHA-2 on.
- numBits- one of 224, 256, 384, or 512.
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
crc32Calculates the cyclic redundancy check value (CRC32) of a binary column and returns the value as a bigint.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
hashCalculates the hash code of given columns, and returns the result as an int column.- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.0.0
 
- 
xxhash64Calculates the hash code of given columns using the 64-bit variant of the xxHash algorithm, and returns the result as a long column. The hash computation uses an initial seed of 42.- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.0.0
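A short Scala sketch computing several digests over the same string column, assuming a SparkSession in scope as spark with spark.implicits._ imported; md5 and crc32 accept the string column via the usual implicit cast to binary:

import org.apache.spark.sql.functions._

Seq("Spark").toDF("s")
  .select(
    md5(col("s")).as("md5_hex"),           // 32-character hex string
    sha2(col("s"), 256).as("sha256_hex"),  // 64-character hex string
    crc32(col("s")).as("crc32_bigint"),
    xxhash64(col("s")).as("xxhash64_long"))
  .show(truncate = false)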
 
- 
assert_trueReturns null if the condition is true, and throws an exception otherwise.- Parameters:
- c- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.1.0
 
- 
assert_trueReturns null if the condition is true; throws an exception with the error message otherwise.- Parameters:
- c- (undocumented)
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.1.0
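A small Scala sketch, assuming a SparkSession in scope as spark with spark.implicits._ imported. Every row satisfies the condition here, so the action completes and the projected column is all nulls; a violating row would fail the job with the supplied message instead:

import org.apache.spark.sql.functions._

val df = Seq(1, 5, 9).toDF("x")

// Every x is below 10, so no exception is raised and the column is all nulls.
df.select(assert_true(col("x") < 10, lit("x must be < 10")).as("check")).show()

// A violating condition would raise an error at execution time:
// df.select(assert_true(col("x") < 0, lit("x must be negative"))).show()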
 
- 
raise_errorThrows an exception with the provided error message.- Parameters:
- c- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.1.0
 
- 
hll_sketch_estimateReturns the estimated number of unique values given the binary representation of a Datasketches HllSketch.- Parameters:
- c- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
hll_sketch_estimateReturns the estimated number of unique values given the binary representation of a Datasketches HllSketch.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
hll_unionMerges two binary representations of Datasketches HllSketch objects, using a Datasketches Union object. Throws an exception if sketches have different lgConfigK values.- Parameters:
- c1- (undocumented)
- c2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
hll_unionMerges two binary representations of Datasketches HllSketch objects, using a Datasketches Union object. Throws an exception if sketches have different lgConfigK values.- Parameters:
- columnName1- (undocumented)
- columnName2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
hll_unionMerges two binary representations of Datasketches HllSketch objects, using a Datasketches Union object. Throws an exception if sketches have different lgConfigK values and allowDifferentLgConfigK is set to false.- Parameters:
- c1- (undocumented)
- c2- (undocumented)
- allowDifferentLgConfigK- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
hll_union
public static Column hll_union(String columnName1, String columnName2, boolean allowDifferentLgConfigK)
Merges two binary representations of Datasketches HllSketch objects, using a Datasketches Union object. Throws an exception if sketches have different lgConfigK values and allowDifferentLgConfigK is set to false.
- Parameters:
- columnName1- (undocumented)
- columnName2- (undocumented)
- allowDifferentLgConfigK- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
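A Scala sketch that merges two sketches and estimates the combined distinct count. It assumes hll_sketch_agg (documented among the aggregate functions) is available in this functions API, and that a SparkSession is in scope as spark with spark.implicits._ imported:

import org.apache.spark.sql.functions._

val sk1 = Seq(1, 2, 3).toDF("v").agg(hll_sketch_agg(col("v")).as("sketch1"))
val sk2 = Seq(3, 4, 5).toDF("v").agg(hll_sketch_agg(col("v")).as("sketch2"))

sk1.crossJoin(sk2)
  .select(hll_sketch_estimate(hll_union(col("sketch1"), col("sketch2"))).as("approx_distinct"))
  .show()
// approx_distinct is about 5 (the union {1, 2, 3, 4, 5})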
 
- 
theta_differenceSubtracts two binary representations of Datasketches ThetaSketch objects in the input columns using a Datasketches AnotB object- Parameters:
- c1- (undocumented)
- c2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
theta_differenceSubtracts two binary representations of Datasketches ThetaSketch objects in the input columns using a Datasketches AnotB object- Parameters:
- columnName1- (undocumented)
- columnName2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
theta_intersectionIntersects two binary representations of Datasketches ThetaSketch objects in the input columns using a Datasketches Intersection object- Parameters:
- c1- (undocumented)
- c2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
theta_intersectionIntersects two binary representations of Datasketches ThetaSketch objects in the input columns using a Datasketches Intersection object- Parameters:
- columnName1- (undocumented)
- columnName2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
theta_sketch_estimateReturns the estimated number of unique values given the binary representation of a Datasketches ThetaSketch.- Parameters:
- c- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
theta_sketch_estimateReturns the estimated number of unique values given the binary representation of a Datasketches ThetaSketch.- Parameters:
- columnName- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
theta_union
Unions two binary representations of Datasketches ThetaSketch objects in the input columns using a Datasketches Union object. It is configured with the default value of 12 for lgNomEntries.
- Parameters:
- c1- (undocumented)
- c2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
theta_union
Unions two binary representations of Datasketches ThetaSketch objects in the input columns using a Datasketches Union object. It is configured with the default value of 12 for lgNomEntries.
- Parameters:
- columnName1- (undocumented)
- columnName2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
theta_union
Unions two binary representations of Datasketches ThetaSketch objects in the input columns using a Datasketches Union object. It allows the configuration of lgNomEntries (log nominal entries) for the union buffer.
- Parameters:
- c1- (undocumented)
- c2- (undocumented)
- lgNomEntries- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
theta_union
Unions two binary representations of Datasketches ThetaSketch objects in the input columns using a Datasketches Union object. It allows the configuration of lgNomEntries (log nominal entries) for the union buffer.
- Parameters:
- columnName1- (undocumented)
- columnName2- (undocumented)
- lgNomEntries- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
theta_union
Unions two binary representations of Datasketches ThetaSketch objects in the input columns using a Datasketches Union object. It allows the configuration of lgNomEntries (log nominal entries) for the union buffer.
- Parameters:
- c1- (undocumented)
- c2- (undocumented)
- lgNomEntries- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
userReturns the user name of current execution context.- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
session_userReturns the user name of current execution context.- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
uuid
Returns a universally unique identifier (UUID) string. The value is returned as a canonical UUID 36-character string.
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
uuid
Returns a universally unique identifier (UUID) string. The value is returned as a canonical UUID 36-character string.
- Parameters:
- seed- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
aes_encrypt
public static Column aes_encrypt(Column input, Column key, Column mode, Column padding, Column iv, Column aad)
Returns an encrypted value of input using AES in given mode with the specified padding. Key lengths of 16, 24 and 32 bytes are supported. Supported combinations of (mode, padding) are ('ECB', 'PKCS'), ('GCM', 'NONE') and ('CBC', 'PKCS'). Optional initialization vectors (IVs) are only supported for CBC and GCM modes. These must be 16 bytes for CBC and 12 bytes for GCM. If not provided, a random vector will be generated and prepended to the output. Optional additional authenticated data (AAD) is only supported for GCM. If provided for encryption, the identical AAD value must be provided for decryption. The default mode is GCM.
- Parameters:
- input- The binary value to encrypt.
- key- The passphrase to use to encrypt the data.
- mode- Specifies which block cipher mode should be used to encrypt messages. Valid modes: ECB, GCM, CBC.
- padding- Specifies how to pad messages whose length is not a multiple of the block size. Valid values: PKCS, NONE, DEFAULT. The DEFAULT padding means PKCS for ECB, NONE for GCM and PKCS for CBC.
- iv- Optional initialization vector. Only supported for CBC and GCM modes. Valid values: None or "". 16-byte array for CBC mode. 12-byte array for GCM mode.
- aad- Optional additional authenticated data. Only supported for GCM mode. This can be any free-form input and must be provided for both encryption and decryption.
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
aes_encryptReturns an encrypted value ofinput.- Parameters:
- input- (undocumented)
- key- (undocumented)
- mode- (undocumented)
- padding- (undocumented)
- iv- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
- See Also:
- 
- org.apache.spark.sql.functions.aes_encrypt(Column, Column, Column, Column, Column, Column)
 
 
- 
aes_encryptReturns an encrypted value ofinput.- Parameters:
- input- (undocumented)
- key- (undocumented)
- mode- (undocumented)
- padding- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
- See Also:
- 
- org.apache.spark.sql.functions.aes_encrypt(Column, Column, Column, Column, Column, Column)
 
 
- 
aes_encryptReturns an encrypted value ofinput.- Parameters:
- input- (undocumented)
- key- (undocumented)
- mode- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
- See Also:
- 
- org.apache.spark.sql.functions.aes_encrypt(Column, Column, Column, Column, Column, Column)
 
 
- 
aes_encryptReturns an encrypted value ofinput.- Parameters:
- input- (undocumented)
- key- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
- See Also:
- 
- org.apache.spark.sql.functions.aes_encrypt(Column, Column, Column, Column, Column, Column)
 
 
- 
aes_decrypt
Returns a decrypted value of input using AES in mode with padding. Key lengths of 16, 24 and 32 bytes are supported. Supported combinations of (mode, padding) are ('ECB', 'PKCS'), ('GCM', 'NONE') and ('CBC', 'PKCS'). Optional additional authenticated data (AAD) is only supported for GCM. If provided for encryption, the identical AAD value must be provided for decryption. The default mode is GCM.
- Parameters:
- input- The binary value to decrypt.
- key- The passphrase to use to decrypt the data.
- mode- Specifies which block cipher mode should be used to decrypt messages. Valid modes: ECB, GCM, CBC.
- padding- Specifies how to pad messages whose length is not a multiple of the block size. Valid values: PKCS, NONE, DEFAULT. The DEFAULT padding means PKCS for ECB, NONE for GCM and PKCS for CBC.
- aad- Optional additional authenticated data. Only supported for GCM mode. This can be any free-form input and must be provided for both encryption and decryption.
- Returns:
- (undocumented)
- Since:
- 3.5.0
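A Scala sketch of an encrypt/decrypt round trip using the default GCM mode and NONE padding; the 16-character passphrase and the message are illustrative, and a SparkSession is assumed in scope as spark with spark.implicits._ imported:

import org.apache.spark.sql.functions._

val key = lit("0123456789abcdef")  // 16-byte passphrase -> AES-128

Seq("Spark SQL").toDF("msg")
  .select(
    aes_decrypt(
      aes_encrypt(col("msg").cast("binary"), key),
      key
    ).cast("string").as("roundtrip"))
  .show()
// roundtrip = "Spark SQL"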
 
- 
aes_decryptReturns a decrypted value ofinput.- Parameters:
- input- (undocumented)
- key- (undocumented)
- mode- (undocumented)
- padding- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
- See Also:
- 
- org.apache.spark.sql.functions.aes_decrypt(Column, Column, Column, Column, Column)
 
 
- 
aes_decryptReturns a decrypted value ofinput.- Parameters:
- input- (undocumented)
- key- (undocumented)
- mode- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
- See Also:
- 
- org.apache.spark.sql.functions.aes_decrypt(Column, Column, Column, Column, Column)
 
 
- 
aes_decryptReturns a decrypted value ofinput.- Parameters:
- input- (undocumented)
- key- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
- See Also:
- 
- org.apache.spark.sql.functions.aes_decrypt(Column, Column, Column, Column, Column)
 
 
- 
try_aes_decrypt
public static Column try_aes_decrypt(Column input, Column key, Column mode, Column padding, Column aad)
This is a special version of aes_decrypt that performs the same operation, but returns a NULL value instead of raising an error if the decryption cannot be performed.
- Parameters:
- input- The binary value to decrypt.
- key- The passphrase to use to decrypt the data.
- mode- Specifies which block cipher mode should be used to decrypt messages. Valid modes: ECB, GCM, CBC.
- padding- Specifies how to pad messages whose length is not a multiple of the block size. Valid values: PKCS, NONE, DEFAULT. The DEFAULT padding means PKCS for ECB, NONE for GCM and PKCS for CBC.
- aad- Optional additional authenticated data. Only supported for GCM mode. This can be any free-form input and must be provided for both encryption and decryption.
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
try_aes_decryptReturns a decrypted value ofinput.- Parameters:
- input- (undocumented)
- key- (undocumented)
- mode- (undocumented)
- padding- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
- See Also:
- 
- org.apache.spark.sql.functions.try_aes_decrypt(Column, Column, Column, Column, Column)
 
 
- 
try_aes_decryptReturns a decrypted value ofinput.- Parameters:
- input- (undocumented)
- key- (undocumented)
- mode- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
- See Also:
- 
- org.apache.spark.sql.functions.try_aes_decrypt(Column, Column, Column, Column, Column)
 
 
- 
try_aes_decryptReturns a decrypted value ofinput.- Parameters:
- input- (undocumented)
- key- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
- See Also:
- 
- org.apache.spark.sql.functions.try_aes_decrypt(Column, Column, Column, Column, Column)
 
 
- 
sha
Returns a sha1 hash value as a hex string of the col.
- Parameters:
- col- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
input_file_block_lengthReturns the length of the block being read, or -1 if not available.- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
input_file_block_startReturns the start offset of the block being read, or -1 if not available.- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
reflectCalls a method with reflection.- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
java_methodCalls a method with reflection.- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
try_reflect
This is a special version of reflect that performs the same operation, but returns a NULL value instead of raising an error if the invoked method throws an exception.
- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
versionReturns the Spark version. The string contains 2 fields, the first being a release version and the second being a git revision.- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
typeofReturn DDL-formatted type string for the data type of the input.- Parameters:
- col- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
stack
Separates col1, ..., colk into n rows. Uses column names col0, col1, etc. by default unless specified otherwise.
- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
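A short Scala sketch unpivoting four quarterly columns into four (label, value) rows per input row, assuming a SparkSession in scope as spark with spark.implicits._ imported; the column names are illustrative and the generated columns keep the default names col0 and col1:

import org.apache.spark.sql.functions._

Seq(("widget", 10, 20, 30, 40)).toDF("item", "q1", "q2", "q3", "q4")
  .select(
    col("item"),
    stack(lit(4),
      lit("q1"), col("q1"),
      lit("q2"), col("q2"),
      lit("q3"), col("q3"),
      lit("q4"), col("q4")))
  .show()
// four rows per input row: (widget, q1, 10), (widget, q2, 20), ...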
 
- 
uniform
Returns a random value with independent and identically distributed (i.i.d.) values within the specified range of numbers. The provided numbers specifying the minimum and maximum values of the range must be constant. If both of these numbers are integers, then the result will also be an integer. Otherwise if one or both of these are floating-point numbers, then the result will also be a floating-point number.
- Parameters:
- min- (undocumented)
- max- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
uniform
Returns a random value with independent and identically distributed (i.i.d.) values within the specified range of numbers, with the chosen random seed. The provided numbers specifying the minimum and maximum values of the range must be constant. If both of these numbers are integers, then the result will also be an integer. Otherwise if one or both of these are floating-point numbers, then the result will also be a floating-point number.
- Parameters:
- min- (undocumented)
- max- (undocumented)
- seed- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
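A short Scala sketch; integer bounds yield integers and floating-point bounds yield doubles, and the bounds must be constant expressions. A SparkSession is assumed in scope as spark:

import org.apache.spark.sql.functions._

spark.range(3)
  .select(
    uniform(lit(10), lit(20), lit(42)).as("u_int"),  // random integers between 10 and 20, reproducible via the seed
    uniform(lit(0.0), lit(1.0)).as("u_double"))      // random doubles between 0.0 and 1.0
  .show()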
 
- 
randomReturns a random value with independent and identically distributed (i.i.d.) uniformly distributed values in [0, 1).- Parameters:
- seed- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
randomReturns a random value with independent and identically distributed (i.i.d.) uniformly distributed values in [0, 1).- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
bitmap_bit_position
Returns the bit position for the given input column.
- Parameters:
- col- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
bitmap_bucket_number
Returns the bucket number for the given input column.
- Parameters:
- col- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
bitmap_construct_aggReturns a bitmap with the positions of the bits set from all the values from the input column. The input column will most likely be bitmap_bit_position().- Parameters:
- col- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
bitmap_countReturns the number of set bits in the input bitmap.- Parameters:
- col- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
bitmap_or_aggReturns a bitmap that is the bitwise OR of all of the bitmaps from the input column. The input column should be bitmaps created from bitmap_construct_agg().- Parameters:
- col- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
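A Scala sketch of the intended pipeline: bucket each id, build one bitmap per bucket from the bit positions, then sum the set-bit counts to get an exact distinct count. A SparkSession is assumed in scope as spark with spark.implicits._ imported; the id values are illustrative:

import org.apache.spark.sql.functions._

Seq(1L, 2L, 3L, 3L, 70000L).toDF("id")
  .groupBy(bitmap_bucket_number(col("id")).as("bucket"))
  .agg(bitmap_construct_agg(bitmap_bit_position(col("id"))).as("bitmap"))
  .agg(sum(bitmap_count(col("bitmap"))).as("distinct_ids"))
  .show()
// distinct_ids = 4 (1, 2, 3 and 70000; the duplicate 3 is counted once)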
 
- 
bitmap_and_aggReturns a bitmap that is the bitwise AND of all of the bitmaps from the input column. The input column should be bitmaps created from bitmap_construct_agg().- Parameters:
- col- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
asciiComputes the numeric value of the first character of the string column, and returns the result as an int column.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
base64Computes the BASE64 encoding of a binary column and returns it as a string column. This is the reverse of unbase64.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
bit_lengthCalculates the bit length for the specified string column.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.3.0
 
- 
concat_wsConcatenates multiple input string columns together into a single string column, using the given separator.- Parameters:
- sep- (undocumented)
- exprs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
- Note:
- Input strings which are null are skipped.
 
- 
decodeComputes the first argument into a string from a binary using the provided character set (one of 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16', 'UTF-32'). If either argument is null, the result will also be null.- Parameters:
- value- (undocumented)
- charset- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
encodeComputes the first argument into a binary from a string using the provided character set (one of 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16', 'UTF-32'). If either argument is null, the result will also be null.- Parameters:
- value- (undocumented)
- charset- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
is_valid_utf8Returns true if the input is a valid UTF-8 string, otherwise returns false.- Parameters:
- str- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
make_valid_utf8Returns a new string in which all invalid UTF-8 byte sequences, if any, are replaced by the Unicode replacement character (U+FFFD).- Parameters:
- str- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
validate_utf8Returns the input value if it corresponds to a valid UTF-8 string, or emits a SparkIllegalArgumentException exception otherwise.- Parameters:
- str- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
try_validate_utf8Returns the input value if it corresponds to a valid UTF-8 string, or NULL otherwise.- Parameters:
- str- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
format_number
Formats numeric column x to a format like '#,###,###.##', rounded to d decimal places with HALF_EVEN round mode, and returns the result as a string column. If d is 0, the result has no decimal point or fractional part. If d is less than 0, the result will be null.
- Parameters:
- x- (undocumented)
- d- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
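A one-line Scala sketch, assuming a SparkSession in scope as spark with spark.implicits._ imported:

import org.apache.spark.sql.functions._

Seq(1234567.891).toDF("x")
  .select(format_number(col("x"), 2).as("formatted"))
  .show()
// formatted = "1,234,567.89"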
 
- 
format_stringFormats the arguments in printf-style and returns the result as a string column.- Parameters:
- format- (undocumented)
- arguments- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
initcap
Returns a new string column by converting the first letter of each word to uppercase. Words are delimited by whitespace. For example, "hello world" will become "Hello World".
- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
instrLocate the position of the first occurrence of substr column in the given string. Returns null if either of the arguments are null.- Parameters:
- str- (undocumented)
- substring- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
- Note:
- The position is not zero based, but 1 based index. Returns 0 if substr could not be found in str.
 
- 
instrLocate the position of the first occurrence of substr column in the given string. Returns null if either of the arguments are null.- Parameters:
- str- (undocumented)
- substring- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
- Note:
- The position is not zero based, but 1 based index. Returns 0 if substr could not be found in str.
 
- 
lengthComputes the character length of a given string or number of bytes of a binary string. The length of character strings include the trailing spaces. The length of binary strings includes binary zeros.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
lenComputes the character length of a given string or number of bytes of a binary string. The length of character strings include the trailing spaces. The length of binary strings includes binary zeros.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
lowerConverts a string column to lower case.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
levenshteinComputes the Levenshtein distance of the two given string columns if it's less than or equal to a given threshold.- Parameters:
- l- (undocumented)
- r- (undocumented)
- threshold- (undocumented)
- Returns:
- result distance, or -1
- Since:
- 3.5.0
 
- 
levenshteinComputes the Levenshtein distance of the two given string columns.- Parameters:
- l- (undocumented)
- r- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
locateLocate the position of the first occurrence of substr.- Parameters:
- substr- (undocumented)
- str- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
- Note:
- The position is not zero based, but 1 based index. Returns 0 if substr could not be found in str.
 
- 
locateLocate the position of the first occurrence of substr in a string column, after position pos.- Parameters:
- substr- (undocumented)
- str- (undocumented)
- pos- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
- Note:
- The position is not zero based, but 1 based index. returns 0 if substr could not be found in str.
 
- 
lpadLeft-pad the string column with pad to a length of len. If the string column is longer than len, the return value is shortened to len characters.- Parameters:
- str- (undocumented)
- len- (undocumented)
- pad- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
lpadLeft-pad the binary column with pad to a byte length of len. If the binary column is longer than len, the return value is shortened to len bytes.- Parameters:
- str- (undocumented)
- len- (undocumented)
- pad- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.3.0
 
- 
lpadLeft-pad the string column with pad to a length of len. If the string column is longer than len, the return value is shortened to len characters.- Parameters:
- str- (undocumented)
- len- (undocumented)
- pad- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
ltrimTrim the spaces from left end for the specified string value.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
ltrimTrim the specified character string from left end for the specified string column.- Parameters:
- e- (undocumented)
- trimString- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.3.0
 
- 
ltrimTrim the specified character string from left end for the specified string column.- Parameters:
- e- (undocumented)
- trim- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
octet_lengthCalculates the byte length for the specified string column.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.3.0
 
- 
collateMarks a given column with specified collation.- Parameters:
- e- (undocumented)
- collation- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
collationReturns the collation name of a given column.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
rlike
Returns true if str matches regexp, or false otherwise.
- Parameters:
- str- (undocumented)
- regexp- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
regexp
Returns true if str matches regexp, or false otherwise.
- Parameters:
- str- (undocumented)
- regexp- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
regexp_like
Returns true if str matches regexp, or false otherwise.
- Parameters:
- str- (undocumented)
- regexp- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
regexp_count
Returns a count of the number of times that the regular expression pattern regexp is matched in the string str.
- Parameters:
- str- (undocumented)
- regexp- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
regexp_extract
Extract a specific group matched by a Java regex, from the specified string column. If the regex did not match, or the specified group did not match, an empty string is returned. If the specified group index exceeds the group count of regex, an IllegalArgumentException will be thrown.
- Parameters:
- e- (undocumented)
- exp- (undocumented)
- groupIdx- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
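A small Scala sketch pulling two capture groups out of the same string, assuming a SparkSession in scope as spark with spark.implicits._ imported:

import org.apache.spark.sql.functions._

Seq("100-200").toDF("s")
  .select(
    regexp_extract(col("s"), "(\\d+)-(\\d+)", 1).as("first"),   // "100"
    regexp_extract(col("s"), "(\\d+)-(\\d+)", 2).as("second"))  // "200"
  .show()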
 
- 
regexp_extract_all
Extract all strings in the str that match the regexp expression and corresponding to the first regex group index.
- Parameters:
- str- (undocumented)
- regexp- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
regexp_extract_all
Extract all strings in the str that match the regexp expression and corresponding to the regex group index.
- Parameters:
- str- (undocumented)
- regexp- (undocumented)
- idx- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
regexp_replaceReplace all substrings of the specified string value that match regexp with rep.- Parameters:
- e- (undocumented)
- pattern- (undocumented)
- replacement- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
regexp_replaceReplace all substrings of the specified string value that match regexp with rep.- Parameters:
- e- (undocumented)
- pattern- (undocumented)
- replacement- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
regexp_substr
Returns the substring that matches the regular expression regexp within the string str. If the regular expression is not found, the result is null.
- Parameters:
- str- (undocumented)
- regexp- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
regexp_instrSearches a string for a regular expression and returns an integer that indicates the beginning position of the matched substring. Positions are 1-based, not 0-based. If no match is found, returns 0.- Parameters:
- str- (undocumented)
- regexp- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
regexp_instrSearches a string for a regular expression and returns an integer that indicates the beginning position of the matched substring. Positions are 1-based, not 0-based. If no match is found, returns 0.- Parameters:
- str- (undocumented)
- regexp- (undocumented)
- idx- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
unbase64Decodes a BASE64 encoded string column and returns it as a binary column. This is the reverse of base64.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
rpadRight-pad the string column with pad to a length of len. If the string column is longer than len, the return value is shortened to len characters.- Parameters:
- str- (undocumented)
- len- (undocumented)
- pad- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
rpadRight-pad the binary column with pad to a byte length of len. If the binary column is longer than len, the return value is shortened to len bytes.- Parameters:
- str- (undocumented)
- len- (undocumented)
- pad- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.3.0
 
- 
rpadRight-pad the string column with pad to a length of len. If the string column is longer than len, the return value is shortened to len characters.- Parameters:
- str- (undocumented)
- len- (undocumented)
- pad- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
repeatRepeats a string column n times, and returns it as a new string column.- Parameters:
- str- (undocumented)
- n- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
repeatRepeats a string column n times, and returns it as a new string column.- Parameters:
- str- (undocumented)
- n- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
rtrimTrim the spaces from right end for the specified string value.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
rtrimTrim the specified character string from right end for the specified string column.- Parameters:
- e- (undocumented)
- trimString- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.3.0
 
- 
rtrimTrim the specified character string from right end for the specified string column.- Parameters:
- e- (undocumented)
- trim- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
soundexReturns the soundex code for the specified expression.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
splitSplits str around matches of the given pattern.- Parameters:
- str- a string expression to split
- pattern- a string representing a regular expression. The regex string should be a Java regular expression.
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
splitSplits str around matches of the given pattern.- Parameters:
- str- a string expression to split
- pattern- a column of string representing a regular expression. The regex string should be a Java regular expression.
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
splitSplits str around matches of the given pattern.- Parameters:
- str- a string expression to split
- pattern- a string representing a regular expression. The regex string should be a Java regular expression.
- limit- an integer expression which controls the number of times the regex is applied.- limit greater than 0: The resulting array's length will not be more than limit, and the resulting array's last entry will contain all input beyond the last matched regex.
- limit less than or equal to 0: regexwill be applied as many times as possible, and the resulting array can be of any size.
 
- Returns:
- (undocumented)
- Since:
- 3.0.0
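A short Scala sketch contrasting an unlimited split with limit = 2, assuming a SparkSession in scope as spark with spark.implicits._ imported:

import org.apache.spark.sql.functions._

Seq("a,b,c,d").toDF("s")
  .select(
    split(col("s"), ",").as("unlimited"),   // ["a", "b", "c", "d"]
    split(col("s"), ",", 2).as("limited"))  // ["a", "b,c,d"]
  .show(truncate = false)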
 
- 
splitSplits str around matches of the given pattern.- Parameters:
- str- a string expression to split
- pattern- a column of string representing a regular expression. The regex string should be a Java regular expression.
- limit- a column of integer expression which controls the number of times the regex is applied.- limit greater than 0: The resulting array's length will not be more than limit, and the resulting array's last entry will contain all input beyond the last matched regex.
- limit less than or equal to 0: regexwill be applied as many times as possible, and the resulting array can be of any size.
 
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
substring
Substring starts at pos and is of length len when str is String type or returns the slice of byte array that starts at pos in byte and is of length len when str is Binary type.
- Parameters:
- str- (undocumented)
- pos- (undocumented)
- len- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
- Note:
- The position is not zero based, but 1 based index.
 
- 
substring
Substring starts at pos and is of length len when str is String type or returns the slice of byte array that starts at pos in byte and is of length len when str is Binary type.
- Parameters:
- str- (undocumented)
- pos- (undocumented)
- len- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
- Note:
- The position is not zero based, but 1 based index.
 
- 
substring_index
Returns the substring from string str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. substring_index performs a case-sensitive match when searching for delim.
- Parameters:
- str- (undocumented)
- delim- (undocumented)
- count- (undocumented)
- Returns:
- (undocumented)
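A short Scala sketch with positive and negative counts, assuming a SparkSession in scope as spark with spark.implicits._ imported:

import org.apache.spark.sql.functions._

Seq("a.b.c.d").toDF("s")
  .select(
    substring_index(col("s"), ".", 2).as("left_of_2nd"),     // "a.b"
    substring_index(col("s"), ".", -1).as("right_of_last"))  // "d"
  .show()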
 
- 
overlay
Overlay the specified portion of src with replace, starting from byte position pos of src and proceeding for len bytes.
- Parameters:
- src- (undocumented)
- replace- (undocumented)
- pos- (undocumented)
- len- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
overlay
Overlay the specified portion of src with replace, starting from byte position pos of src.
- Parameters:
- src- (undocumented)
- replace- (undocumented)
- pos- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
sentencesSplits a string into arrays of sentences, where each sentence is an array of words.- Parameters:
- string- (undocumented)
- language- (undocumented)
- country- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.2.0
 
- 
sentences
Splits a string into arrays of sentences, where each sentence is an array of words. The default country ('') is used.
- Parameters:
- string- (undocumented)
- language- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
sentencesSplits a string into arrays of sentences, where each sentence is an array of words. The default locale is used.- Parameters:
- string- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.2.0
 
- 
translate
Translate any character in the src by a character in replaceString. The characters in replaceString correspond to the characters in matchingString. The translate will happen when any character in the string matches the character in the matchingString.
- Parameters:
- src- (undocumented)
- matchingString- (undocumented)
- replaceString- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
trimTrim the spaces from both ends for the specified string column.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
trimTrim the specified character from both ends for the specified string column.- Parameters:
- e- (undocumented)
- trimString- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.3.0
 
- 
trimTrim the specified character from both ends for the specified string column.- Parameters:
- e- (undocumented)
- trim- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
upperConverts a string column to upper case.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
to_binary
Converts the input e to a binary value based on the supplied format. The format can be a case-insensitive string literal of "hex", "utf-8", "utf8", or "base64". By default, the binary format for conversion is "hex" if format is omitted. The function returns NULL if at least one of the input parameters is NULL.
- Parameters:
- e- (undocumented)
- f- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
to_binary
Converts the input e to a binary value based on the default format "hex". The function returns NULL if at least one of the input parameters is NULL.
- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
to_char
Convert e to a string based on the format. Throws an exception if the conversion fails. The format can consist of the following characters, case insensitive:
- '0' or '9': Specifies an expected digit between 0 and 9. A sequence of 0 or 9 in the format string matches a sequence of digits in the input value, generating a result string of the same length as the corresponding sequence in the format string. The result string is left-padded with zeros if the 0/9 sequence comprises more digits than the matching part of the decimal value, starts with 0, and is before the decimal point. Otherwise, it is padded with spaces.
- '.' or 'D': Specifies the position of the decimal point (optional, only allowed once).
- ',' or 'G': Specifies the position of the grouping (thousands) separator (,). There must be a 0 or 9 to the left and right of each grouping separator.
- '$': Specifies the location of the $ currency sign. This character may only be specified once.
- 'S' or 'MI': Specifies the position of a '-' or '+' sign (optional, only allowed once at the beginning or end of the format string). Note that 'S' prints '+' for positive values but 'MI' prints a space.
- 'PR': Only allowed at the end of the format string; specifies that the result string will be wrapped by angle brackets if the input value is negative.
If e is a datetime, format shall be a valid datetime pattern, see Datetime Patterns. If e is a binary, it is converted to a string in one of the formats: 'base64': a base 64 string. 'hex': a string in the hexadecimal format. 'utf-8': the input binary is decoded to UTF-8 string.
- Parameters:
- e- (undocumented)
- format- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
to_varchar
Convert e to a string based on the format. Throws an exception if the conversion fails. The format can consist of the following characters, case insensitive:
- '0' or '9': Specifies an expected digit between 0 and 9. A sequence of 0 or 9 in the format string matches a sequence of digits in the input value, generating a result string of the same length as the corresponding sequence in the format string. The result string is left-padded with zeros if the 0/9 sequence comprises more digits than the matching part of the decimal value, starts with 0, and is before the decimal point. Otherwise, it is padded with spaces.
- '.' or 'D': Specifies the position of the decimal point (optional, only allowed once).
- ',' or 'G': Specifies the position of the grouping (thousands) separator (,). There must be a 0 or 9 to the left and right of each grouping separator.
- '$': Specifies the location of the $ currency sign. This character may only be specified once.
- 'S' or 'MI': Specifies the position of a '-' or '+' sign (optional, only allowed once at the beginning or end of the format string). Note that 'S' prints '+' for positive values but 'MI' prints a space.
- 'PR': Only allowed at the end of the format string; specifies that the result string will be wrapped by angle brackets if the input value is negative.
If e is a datetime, format shall be a valid datetime pattern, see Datetime Patterns. If e is a binary, it is converted to a string in one of the formats: 'base64': a base 64 string. 'hex': a string in the hexadecimal format. 'utf-8': the input binary is decoded to UTF-8 string.
- Parameters:
- e- (undocumented)
- format- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
to_number
Convert string 'e' to a number based on the string format 'format'. Throws an exception if the conversion fails. The format can consist of the following characters, case insensitive:
- '0' or '9': Specifies an expected digit between 0 and 9. A sequence of 0 or 9 in the format string matches a sequence of digits in the input string. If the 0/9 sequence starts with 0 and is before the decimal point, it can only match a digit sequence of the same size. Otherwise, if the sequence starts with 9 or is after the decimal point, it can match a digit sequence that has the same or smaller size.
- '.' or 'D': Specifies the position of the decimal point (optional, only allowed once).
- ',' or 'G': Specifies the position of the grouping (thousands) separator (,). There must be a 0 or 9 to the left and right of each grouping separator. 'expr' must match the grouping separator relevant for the size of the number.
- '$': Specifies the location of the $ currency sign. This character may only be specified once.
- 'S' or 'MI': Specifies the position of a '-' or '+' sign (optional, only allowed once at the beginning or end of the format string). Note that 'S' allows '-' but 'MI' does not.
- 'PR': Only allowed at the end of the format string; specifies that 'expr' indicates a negative number with wrapping angled brackets.
- Parameters:
- e- (undocumented)
- format- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
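A short Scala sketch, assuming a SparkSession in scope as spark with spark.implicits._ imported. It uses the try_to_number variant (documented further below), which has the same format semantics as to_number but returns NULL instead of failing on the non-matching value:

import org.apache.spark.sql.functions._

Seq("12,345.67", "not a number").toDF("s")
  .select(
    col("s"),
    try_to_number(col("s"), lit("99,999.99")).as("parsed"))
  .show()
// "12,345.67" -> 12345.67; "not a number" -> null (to_number would throw instead)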
 
- 
replace
Replaces all occurrences of search with replace.
- Parameters:
- src- A column of string to be replaced
- search - A column of string. If search is not found in str, str is returned unchanged.
- replace - A column of string. If replace is not specified or is an empty string, nothing replaces the string that is removed from str.
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
replace
Replaces all occurrences of search with replace.
- Parameters:
- src- A column of string to be replaced
- search - A column of string. If search is not found in src, src is returned unchanged.
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
split_part
Splits str by delimiter and return requested part of the split (1-based). If any input is null, returns null. If partNum is out of range of split parts, returns empty string. If partNum is 0, throws an error. If partNum is negative, the parts are counted backward from the end of the string. If the delimiter is an empty string, the str is not split.
- Parameters:
- str- (undocumented)
- delimiter- (undocumented)
- partNum- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
substr
Returns the substring of str that starts at pos and is of length len, or the slice of byte array that starts at pos and is of length len.
- Parameters:
- str- (undocumented)
- pos- (undocumented)
- len- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
substr
Returns the substring of str that starts at pos, or the slice of byte array that starts at pos.
- Parameters:
- str- (undocumented)
- pos- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
try_parse_urlExtracts a part from a URL.- Parameters:
- url- (undocumented)
- partToExtract- (undocumented)
- key- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
try_parse_urlExtracts a part from a URL.- Parameters:
- url- (undocumented)
- partToExtract- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
parse_urlExtracts a part from a URL.- Parameters:
- url- (undocumented)
- partToExtract- (undocumented)
- key- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
parse_urlExtracts a part from a URL.- Parameters:
- url- (undocumented)
- partToExtract- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
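A short Scala sketch extracting the host and one query parameter, assuming a SparkSession in scope as spark with spark.implicits._ imported; the URL is illustrative:

import org.apache.spark.sql.functions._

Seq("https://spark.apache.org/path?query=sql&lang=scala").toDF("url")
  .select(
    parse_url(col("url"), lit("HOST")).as("host"),               // "spark.apache.org"
    parse_url(col("url"), lit("QUERY"), lit("lang")).as("lang"))  // "scala"
  .show(truncate = false)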
 
- 
printfFormats the arguments in printf-style and returns the result as a string column.- Parameters:
- format- (undocumented)
- arguments- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
url_decode
Decodes a str in 'application/x-www-form-urlencoded' format using a specific encoding scheme.
- Parameters:
- str- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
try_url_decode
This is a special version of url_decode that performs the same operation, but returns a NULL value instead of raising an error if the decoding cannot be performed.
- Parameters:
- str- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
url_encodeTranslates a string into 'application/x-www-form-urlencoded' format using a specific encoding scheme.- Parameters:
- str- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
position
Returns the position of the first occurrence of substr in str after position start. The given start and return value are 1-based.
- Parameters:
- substr- (undocumented)
- str- (undocumented)
- start- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
position
Returns the position of the first occurrence of substr in str after position 1. The return value is 1-based.
- Parameters:
- substr- (undocumented)
- str- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
endswithReturns a boolean. The value is True if str ends with suffix. Returns NULL if either input expression is NULL. Otherwise, returns False. Both str or suffix must be of STRING or BINARY type.- Parameters:
- str- (undocumented)
- suffix- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
startswithReturns a boolean. The value is True if str starts with prefix. Returns NULL if either input expression is NULL. Otherwise, returns False. Both str or prefix must be of STRING or BINARY type.- Parameters:
- str- (undocumented)
- prefix- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
btrim: Removes the leading and trailing space characters from str. - Parameters:
- str- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
btrim: Removes the leading and trailing trim characters from str. - Parameters:
- str- (undocumented)
- trim- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
try_to_binary: This is a special version of to_binary that performs the same operation, but returns a NULL value instead of raising an error if the conversion cannot be performed. - Parameters:
- e- (undocumented)
- f- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
try_to_binary: This is a special version of to_binary that performs the same operation, but returns a NULL value instead of raising an error if the conversion cannot be performed. - Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
try_to_number: Converts string e to a number based on the string format format. Returns NULL if the string e does not match the expected format. The format follows the same semantics as the to_number function. - Parameters:
- e- (undocumented)
- format- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
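A hedged sketch of try_to_number returning null for malformed input instead of failing; the format string and sample values are assumptions:

  import org.apache.spark.sql.functions.{try_to_number, col, lit}
  import spark.implicits._  // assumes an existing SparkSession named spark

  val df = Seq("$78.12", "oops").toDF("s")
  // In the format, '$' is a literal dollar sign, '9' stands for a digit, '.' for the decimal point.
  df.select(try_to_number(col("s"), lit("$99.99"))).show()
  // 78.12 for the first row, null for the malformed second row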
 
- 
char_lengthReturns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.- Parameters:
- str- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
character_lengthReturns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.- Parameters:
- str- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
chr: Returns the ASCII character having the binary equivalent to n. If n is larger than 256, the result is equivalent to chr(n % 256). - Parameters:
- n- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
contains: Returns a boolean. The value is True if right is found inside left. Returns NULL if either input expression is NULL. Otherwise, returns False. Both left and right must be of STRING or BINARY type. - Parameters:
- left- (undocumented)
- right- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
elt: Returns the n-th input, e.g., returns input2 when n is 2. The function returns NULL if the index exceeds the length of the array and spark.sql.ansi.enabled is set to false. If spark.sql.ansi.enabled is set to true, it throws ArrayIndexOutOfBoundsException for invalid indices. - Parameters:
- inputs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
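A minimal sketch of elt selecting one of several columns by a 1-based index; the sample row is an assumption:

  import org.apache.spark.sql.functions.{elt, col}
  import spark.implicits._  // assumes an existing SparkSession named spark

  val df = Seq((2, "scala", "java")).toDF("n", "a", "b")
  // n is 2, so the second of the remaining inputs ("java") is returned.
  df.select(elt(col("n"), col("a"), col("b"))).show()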
 
- 
find_in_setReturns the index (1-based) of the given string (str) in the comma-delimited list (strArray). Returns 0, if the string was not found or if the given string (str) contains a comma.- Parameters:
- str- (undocumented)
- strArray- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
like: Returns true if str matches pattern with escapeChar, null if any arguments are null, false otherwise. - Parameters:
- str- (undocumented)
- pattern- (undocumented)
- escapeChar- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
like: Returns true if str matches pattern with escapeChar ('\'), null if any arguments are null, false otherwise. - Parameters:
- str- (undocumented)
- pattern- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
ilike: Returns true if str matches pattern with escapeChar case-insensitively, null if any arguments are null, false otherwise. - Parameters:
- str- (undocumented)
- pattern- (undocumented)
- escapeChar- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
ilike: Returns true if str matches pattern with escapeChar ('\') case-insensitively, null if any arguments are null, false otherwise. - Parameters:
- str- (undocumented)
- pattern- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
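A sketch contrasting like and ilike, including an explicit escape character; the patterns and data are illustrative assumptions:

  import org.apache.spark.sql.functions.{like, ilike, col, lit}
  import spark.implicits._  // assumes an existing SparkSession named spark

  val df = Seq("Spark", "spark_sql").toDF("s")
  df.select(
    like(col("s"), lit("Spark%")),              // case-sensitive: true, false
    ilike(col("s"), lit("spark#_%"), lit("#"))  // case-insensitive, '#' escapes '_': false, true
  ).show()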
 
- 
lcase: Returns str with all characters changed to lowercase. - Parameters:
- str- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
ucase: Returns str with all characters changed to uppercase. - Parameters:
- str- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
left: Returns the leftmost len (len can be string type) characters from the string str. If len is less than or equal to 0, the result is an empty string. - Parameters:
- str- (undocumented)
- len- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
right: Returns the rightmost len (len can be string type) characters from the string str. If len is less than or equal to 0, the result is an empty string. - Parameters:
- str- (undocumented)
- len- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
quote: Returns str enclosed by single quotes, with each instance of a single quote in it preceded by a backslash. - Parameters:
- str- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
add_months: Returns the date that is numMonths after startDate. - Parameters:
- startDate- A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- numMonths- The number of months to add to- startDate, can be negative to subtract months
- Returns:
- A date, or null if startDatewas a string that could not be cast to a date
- Since:
- 1.5.0
 
- 
add_months: Returns the date that is numMonths after startDate. - Parameters:
- startDate- A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- numMonths- A column of the number of months to add to- startDate, can be negative to subtract months
- Returns:
- A date, or null if startDatewas a string that could not be cast to a date
- Since:
- 3.0.0
 
- 
curdateReturns the current date at the start of query evaluation as a date column. All calls of current_date within the same query return the same value.- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
current_dateReturns the current date at the start of query evaluation as a date column. All calls of current_date within the same query return the same value.- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
current_timezoneReturns the current session local timezone.- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
current_timestampReturns the current timestamp at the start of query evaluation as a timestamp column. All calls of current_timestamp within the same query return the same value.- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
nowReturns the current timestamp at the start of query evaluation.- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
localtimestampReturns the current timestamp without time zone at the start of query evaluation as a timestamp without time zone column. All calls of localtimestamp within the same query return the same value.- Returns:
- (undocumented)
- Since:
- 3.3.0
 
- 
date_formatConverts a date/timestamp/string to a value of string in the format specified by the date format given by the second argument.See Datetime Patterns for valid date and time format patterns - Parameters:
- dateExpr- A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- format- A pattern- dd.MM.yyyywould return a string like- 18.03.1993
- Returns:
- A string, or null if dateExprwas a string that could not be cast to a timestamp
- Throws:
- IllegalArgumentException- if the- formatpattern is invalid
- Since:
- 1.5.0
- Note:
- Use specialized functions like year(org.apache.spark.sql.Column)whenever possible as they benefit from a specialized implementation.
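A minimal sketch of date_format applying the pattern mentioned above; the input column and value are assumptions:

  import org.apache.spark.sql.functions.{date_format, col}
  import spark.implicits._  // assumes an existing SparkSession named spark

  val df = Seq("1993-03-18").toDF("d")
  df.select(date_format(col("d"), "dd.MM.yyyy")).show()  // 18.03.1993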
 
- 
date_add: Returns the date that is days days after start. - Parameters:
- start- A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- days- The number of days to add to- start, can be negative to subtract days
- Returns:
- A date, or null if startwas a string that could not be cast to a date
- Since:
- 1.5.0
 
- 
date_add: Returns the date that is days days after start. - Parameters:
- start- A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- days- A column of the number of days to add to- start, can be negative to subtract days
- Returns:
- A date, or null if startwas a string that could not be cast to a date
- Since:
- 3.0.0
 
- 
dateadd: Returns the date that is days days after start. - Parameters:
- start- A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- days- A column of the number of days to add to- start, can be negative to subtract days
- Returns:
- A date, or null if startwas a string that could not be cast to a date
- Since:
- 3.5.0
 
- 
date_sub: Returns the date that is days days before start. - Parameters:
- start- A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- days- The number of days to subtract from- start, can be negative to add days
- Returns:
- A date, or null if startwas a string that could not be cast to a date
- Since:
- 1.5.0
 
- 
date_sub: Returns the date that is days days before start. - Parameters:
- start- A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- days- A column of the number of days to subtract from- start, can be negative to add days
- Returns:
- A date, or null if startwas a string that could not be cast to a date
- Since:
- 3.0.0
 
- 
datediff: Returns the number of days from start to end. Only considers the date part of the input. For example: datediff("2018-01-10 00:00:00", "2018-01-09 23:59:59") // returns 1 - Parameters:
- end- A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- start- A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- Returns:
- An integer, or null if either endorstartwere strings that could not be cast to a date. Negative ifendis beforestart
- Since:
- 1.5.0
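A sketch of datediff showing that only the date parts are compared; the sample timestamps mirror the example above:

  import org.apache.spark.sql.functions.{datediff, col}
  import spark.implicits._  // assumes an existing SparkSession named spark

  val df = Seq(("2018-01-10 00:00:00", "2018-01-09 23:59:59")).toDF("end", "start")
  df.select(datediff(col("end"), col("start"))).show()  // 1, despite the one-second gap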
 
- 
date_diff: Returns the number of days from start to end. Only considers the date part of the input. For example: date_diff("2018-01-10 00:00:00", "2018-01-09 23:59:59") // returns 1 - Parameters:
- end- A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- start- A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- Returns:
- An integer, or null if either endorstartwere strings that could not be cast to a date. Negative ifendis beforestart
- Since:
- 3.5.0
 
- 
date_from_unix_date: Creates a date from the number of days since 1970-01-01. - Parameters:
- days- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
yearExtracts the year as an integer from a given date/timestamp/string.- Parameters:
- e- (undocumented)
- Returns:
- An integer, or null if the input was a string that could not be cast to a date
- Since:
- 1.5.0
 
- 
quarterExtracts the quarter as an integer from a given date/timestamp/string.- Parameters:
- e- (undocumented)
- Returns:
- An integer, or null if the input was a string that could not be cast to a date
- Since:
- 1.5.0
 
- 
monthExtracts the month as an integer from a given date/timestamp/string.- Parameters:
- e- (undocumented)
- Returns:
- An integer, or null if the input was a string that could not be cast to a date
- Since:
- 1.5.0
 
- 
dayofweekExtracts the day of the week as an integer from a given date/timestamp/string. Ranges from 1 for a Sunday through to 7 for a Saturday- Parameters:
- e- (undocumented)
- Returns:
- An integer, or null if the input was a string that could not be cast to a date
- Since:
- 2.3.0
 
- 
dayofmonthExtracts the day of the month as an integer from a given date/timestamp/string.- Parameters:
- e- (undocumented)
- Returns:
- An integer, or null if the input was a string that could not be cast to a date
- Since:
- 1.5.0
 
- 
dayExtracts the day of the month as an integer from a given date/timestamp/string.- Parameters:
- e- (undocumented)
- Returns:
- An integer, or null if the input was a string that could not be cast to a date
- Since:
- 3.5.0
 
- 
dayofyearExtracts the day of the year as an integer from a given date/timestamp/string.- Parameters:
- e- (undocumented)
- Returns:
- An integer, or null if the input was a string that could not be cast to a date
- Since:
- 1.5.0
 
- 
hourExtracts the hours as an integer from a given date/time/timestamp/string.- Parameters:
- e- (undocumented)
- Returns:
- An integer, or null if the input was a string that could not be cast to a date
- Since:
- 1.5.0
 
- 
extractExtracts a part of the date/timestamp or interval source.- Parameters:
- field- selects which part of the source should be extracted.
- source- a date/timestamp or interval column from where- fieldshould be extracted.
- Returns:
- a part of the date/timestamp or interval source
- Since:
- 3.5.0
 
- 
date_partExtracts a part of the date/timestamp or interval source.- Parameters:
- field - selects which part of the source should be extracted; the supported string values are the same as the fields of the equivalent function extract.
- source- a date/timestamp or interval column from where- fieldshould be extracted.
- Returns:
- a part of the date/timestamp or interval source
- Since:
- 3.5.0
 
- 
datepartExtracts a part of the date/timestamp or interval source.- Parameters:
- field - selects which part of the source should be extracted; the supported string values are the same as the fields of the equivalent function EXTRACT.
- source- a date/timestamp or interval column from where- fieldshould be extracted.
- Returns:
- a part of the date/timestamp or interval source
- Since:
- 3.5.0
 
- 
last_dayReturns the last day of the month which the given date belongs to. For example, input "2015-07-27" returns "2015-07-31" since July 31 is the last day of the month in July 2015.- Parameters:
- e- A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- Returns:
- A date, or null if the input was a string that could not be cast to a date
- Since:
- 1.5.0
 
- 
minuteExtracts the minutes as an integer from a given date/time/timestamp/string.- Parameters:
- e- (undocumented)
- Returns:
- An integer, or null if the input was a string that could not be cast to a date
- Since:
- 1.5.0
 
- 
weekdayReturns the day of the week for date/timestamp (0 = Monday, 1 = Tuesday, ..., 6 = Sunday).- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
make_date- Parameters:
- year- (undocumented)
- month- (undocumented)
- day- (undocumented)
- Returns:
- A date created from year, month and day fields.
- Since:
- 3.3.0
 
- 
months_between: Returns number of months between dates start and end. A whole number is returned if both inputs have the same day of month or both are the last day of their respective months. Otherwise, the difference is calculated assuming 31 days per month. For example: months_between("2017-11-14", "2017-07-14") // returns 4.0 months_between("2017-01-01", "2017-01-10") // returns -0.29032258 months_between("2017-06-01", "2017-06-16 12:00:00") // returns -0.5 - Parameters:
- end- A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- start- A date, timestamp or string. If a string, the data must be in a format that can cast to a timestamp, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- Returns:
- A double, or null if either endorstartwere strings that could not be cast to a timestamp. Negative ifendis beforestart
- Since:
- 1.5.0
 
- 
months_between: Returns number of months between dates end and start. If roundOff is set to true, the result is rounded off to 8 digits; it is not rounded otherwise. - Parameters:
- end- (undocumented)
- start- (undocumented)
- roundOff- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
next_day: Returns the first date which is later than the value of the date column that is on the specified day of the week. For example, next_day('2015-07-27', "Sunday") returns 2015-08-02 because that is the first Sunday after 2015-07-27. - Parameters:
- date- A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- dayOfWeek- Case insensitive, and accepts: "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"
- Returns:
- A date, or null if datewas a string that could not be cast to a date or ifdayOfWeekwas an invalid value
- Since:
- 1.5.0
 
- 
next_day: Returns the first date which is later than the value of the date column that is on the specified day of the week. For example, next_day('2015-07-27', "Sunday") returns 2015-08-02 because that is the first Sunday after 2015-07-27. - Parameters:
- date- A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- dayOfWeek- A column of the day of week. Case insensitive, and accepts: "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"
- Returns:
- A date, or null if datewas a string that could not be cast to a date or ifdayOfWeekwas an invalid value
- Since:
- 3.2.0
 
- 
secondExtracts the seconds as an integer from a given date/time/timestamp/string.- Parameters:
- e- (undocumented)
- Returns:
- An integer, or null if the input was a string that could not be cast to a timestamp
- Since:
- 1.5.0
 
- 
weekofyear: Extracts the week number as an integer from a given date/timestamp/string. A week is considered to start on a Monday and week 1 is the first week with more than 3 days, as defined by ISO 8601. - Parameters:
- e- (undocumented)
- Returns:
- An integer, or null if the input was a string that could not be cast to a date
- Since:
- 1.5.0
 
- 
from_unixtimeConverts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the yyyy-MM-dd HH:mm:ss format.- Parameters:
- ut- A number of a type that is castable to a long, such as string or integer. Can be negative for timestamps before the unix epoch
- Returns:
- A string, or null if the input was a string that could not be cast to a long
- Since:
- 1.5.0
 
- 
from_unixtimeConverts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the given format.See Datetime Patterns for valid date and time format patterns - Parameters:
- ut- A number of a type that is castable to a long, such as string or integer. Can be negative for timestamps before the unix epoch
- f- A date time pattern that the input will be formatted to
- Returns:
- A string, or null if utwas a string that could not be cast to a long orfwas an invalid date time pattern
- Since:
- 1.5.0
 
- 
unix_timestampReturns the current Unix timestamp (in seconds) as a long.- Returns:
- (undocumented)
- Since:
- 1.5.0
- Note:
- All calls of unix_timestampwithin the same query return the same value (i.e. the current timestamp is calculated at the start of query evaluation).
 
- 
unix_timestampConverts time string in format yyyy-MM-dd HH:mm:ss to Unix timestamp (in seconds), using the default timezone and the default locale.- Parameters:
- s- A date, timestamp or string. If a string, the data must be in the- yyyy-MM-dd HH:mm:ssformat
- Returns:
- A long, or null if the input was a string not of the correct format
- Since:
- 1.5.0
 
- 
unix_timestampConverts time string with given pattern to Unix timestamp (in seconds).See Datetime Patterns for valid date and time format patterns - Parameters:
- s- A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- p- A date time pattern detailing the format of- swhen- sis a string
- Returns:
- A long, or null if swas a string that could not be cast to a date orpwas an invalid format
- Since:
- 1.5.0
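A sketch pairing unix_timestamp (string to epoch seconds with an explicit pattern) and from_unixtime (epoch seconds back to a formatted string); the pattern and data are assumptions:

  import org.apache.spark.sql.functions.{unix_timestamp, from_unixtime, col}
  import spark.implicits._  // assumes an existing SparkSession named spark

  val df = Seq("2015/07/27 10:00").toDF("t")
  val secs = unix_timestamp(col("t"), "yyyy/MM/dd HH:mm")
  df.select(
    secs,                              // seconds since 1970-01-01 00:00:00 UTC
    from_unixtime(secs, "yyyy-MM-dd")  // "2015-07-27"
  ).show(false)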
 
- 
to_timeParses a string value to a time value.- Parameters:
- str- A string to be parsed to time.
- Returns:
- A time, or raises an error if the input is malformed.
- Since:
- 4.1.0
 
- 
to_timeParses a string value to a time value.See Datetime Patterns for valid time format patterns. - Parameters:
- str- A string to be parsed to time.
- format- A time format pattern to follow.
- Returns:
- A time, or raises an error if the input is malformed.
- Since:
- 4.1.0
 
- 
to_timestampConverts to a timestamp by casting rules toTimestampType.- Parameters:
- s- A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- Returns:
- A timestamp, or null if the input was a string that could not be cast to a timestamp
- Since:
- 2.2.0
 
- 
to_timestampConverts time string with the given pattern to timestamp.See Datetime Patterns for valid date and time format patterns - Parameters:
- s- A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- fmt- A date time pattern detailing the format of- swhen- sis a string
- Returns:
- A timestamp, or null if swas a string that could not be cast to a timestamp orfmtwas an invalid format
- Since:
- 2.2.0
 
- 
try_to_timeParses a string value to a time value.- Parameters:
- str- A string to be parsed to time.
- Returns:
- A time, or null if the input is malformed.
- Since:
- 4.1.0
 
- 
try_to_timeParses a string value to a time value.See Datetime Patterns for valid time format patterns. - Parameters:
- str- A string to be parsed to time.
- format- A time format pattern to follow.
- Returns:
- A time, or null if the input is malformed.
- Since:
- 4.1.0
 
- 
try_to_timestamp: Parses the s with the format to a timestamp. The function always returns null on an invalid input with/without ANSI SQL mode enabled. The result data type is consistent with the value of configuration spark.sql.timestampType. - Parameters:
- s- (undocumented)
- format- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
try_to_timestamp: Parses the s to a timestamp. The function always returns null on an invalid input with/without ANSI SQL mode enabled. It follows casting rules to a timestamp. The result data type is consistent with the value of configuration spark.sql.timestampType. - Parameters:
- s- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
to_date: Converts the column into DateType by casting rules to DateType. - Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
to_date: Converts the column into a DateType with a specified format. See Datetime Patterns for valid date and time format patterns. - Parameters:
- e- A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- fmt- A date time pattern detailing the format of- ewhen- eis a string
- Returns:
- A date, or null if ewas a string that could not be cast to a date orfmtwas an invalid format
- Since:
- 2.2.0
 
- 
try_to_date: This is a special version of to_date that performs the same operation, but returns a NULL value instead of raising an error if the date cannot be created. - Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
try_to_date: This is a special version of to_date that performs the same operation, but returns a NULL value instead of raising an error if the date cannot be created. - Parameters:
- e- (undocumented)
- fmt- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
unix_dateReturns the number of days since 1970-01-01.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
unix_microsReturns the number of microseconds since 1970-01-01 00:00:00 UTC.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
unix_millisReturns the number of milliseconds since 1970-01-01 00:00:00 UTC. Truncates higher levels of precision.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
unix_secondsReturns the number of seconds since 1970-01-01 00:00:00 UTC. Truncates higher levels of precision.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
trunc: Returns date truncated to the unit specified by the format. For example, trunc("2018-11-19 12:01:19", "year") returns 2018-01-01. - Parameters:
- date- A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- format- : 'year', 'yyyy', 'yy' to truncate by year, or 'month', 'mon', 'mm' to truncate by month Other options are: 'week', 'quarter'
- Returns:
- A date, or null if datewas a string that could not be cast to a date orformatwas an invalid value
- Since:
- 1.5.0
 
- 
date_trunc: Returns timestamp truncated to the unit specified by the format. For example, date_trunc("year", "2018-11-19 12:01:19") returns 2018-01-01 00:00:00. - Parameters:
- format- : 'year', 'yyyy', 'yy' to truncate by year, 'month', 'mon', 'mm' to truncate by month, 'day', 'dd' to truncate by day, Other options are: 'microsecond', 'millisecond', 'second', 'minute', 'hour', 'week', 'quarter'
- timestamp- A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- Returns:
- A timestamp, or null if timestampwas a string that could not be cast to a timestamp orformatwas an invalid value
- Since:
- 2.3.0
 
- 
from_utc_timestampGiven a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'.- Parameters:
- ts- A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- tz- A string detailing the time zone ID that the input should be adjusted to. It should be in the format of either region-based zone IDs or zone offsets. Region IDs must have the form 'area/city', such as 'America/Los_Angeles'. Zone offsets must be in the format '(+|-)HH:mm', for example '-08:00' or '+01:00'. Also 'UTC' and 'Z' are supported as aliases of '+00:00'. Other short names are not recommended to use because they can be ambiguous.
- Returns:
- A timestamp, or null if tswas a string that could not be cast to a timestamp ortzwas an invalid value
- Since:
- 1.5.0
 
- 
from_utc_timestampGiven a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'.- Parameters:
- ts- (undocumented)
- tz- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
to_utc_timestampGiven a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC. For example, 'GMT+1' would yield '2017-07-14 01:40:00.0'.- Parameters:
- ts- A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as- yyyy-MM-ddor- yyyy-MM-dd HH:mm:ss.SSSS
- tz- A string detailing the time zone ID that the input should be adjusted to. It should be in the format of either region-based zone IDs or zone offsets. Region IDs must have the form 'area/city', such as 'America/Los_Angeles'. Zone offsets must be in the format '(+|-)HH:mm', for example '-08:00' or '+01:00'. Also 'UTC' and 'Z' are supported as aliases of '+00:00'. Other short names are not recommended to use because they can be ambiguous.
- Returns:
- A timestamp, or null if tswas a string that could not be cast to a timestamp ortzwas an invalid value
- Since:
- 1.5.0
 
- 
to_utc_timestampGiven a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC. For example, 'GMT+1' would yield '2017-07-14 01:40:00.0'.- Parameters:
- ts- (undocumented)
- tz- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
windowpublic static Column window(Column timeColumn, String windowDuration, String slideDuration, String startTime) Bucketize rows into one or more time windows given a timestamp specifying column. Window starts are inclusive but the window ends are exclusive, e.g. 12:05 will be in the window [12:05,12:10) but not in [12:00,12:05). Windows can support microsecond precision. Windows in the order of months are not supported. The following example takes the average stock price for a one minute window every 10 seconds starting 5 seconds after the hour:val df = ... // schema => timestamp: TimestampType, stockId: StringType, price: DoubleType df.groupBy(window($"timestamp", "1 minute", "10 seconds", "5 seconds"), $"stockId") .agg(mean("price"))The windows will look like: 09:00:05-09:01:05 09:00:15-09:01:15 09:00:25-09:01:25 ...For a streaming query, you may use the function current_timestampto generate windows on processing time.- Parameters:
- timeColumn- The column or the expression to use as the timestamp for windowing by time. The time column must be of TimestampType or TimestampNTZType.
- windowDuration- A string specifying the width of the window, e.g.- 10 minutes,- 1 second. Check- org.apache.spark.unsafe.types.CalendarIntervalfor valid duration identifiers. Note that the duration is a fixed length of time, and does not vary over time according to a calendar. For example,- 1 dayalways means 86,400,000 milliseconds, not a calendar day.
- slideDuration- A string specifying the sliding interval of the window, e.g.- 1 minute. A new window will be generated every- slideDuration. Must be less than or equal to the- windowDuration. Check- org.apache.spark.unsafe.types.CalendarIntervalfor valid duration identifiers. This duration is likewise absolute, and does not vary according to a calendar.
- startTime- The offset with respect to 1970-01-01 00:00:00 UTC with which to start window intervals. For example, in order to have hourly tumbling windows that start 15 minutes past the hour, e.g. 12:15-13:15, 13:15-14:15... provide- startTimeas- 15 minutes.
- Returns:
- (undocumented)
- Since:
- 2.0.0
 
- 
windowBucketize rows into one or more time windows given a timestamp specifying column. Window starts are inclusive but the window ends are exclusive, e.g. 12:05 will be in the window [12:05,12:10) but not in [12:00,12:05). Windows can support microsecond precision. Windows in the order of months are not supported. The windows start beginning at 1970-01-01 00:00:00 UTC. The following example takes the average stock price for a one minute window every 10 seconds:val df = ... // schema => timestamp: TimestampType, stockId: StringType, price: DoubleType df.groupBy(window($"timestamp", "1 minute", "10 seconds"), $"stockId") .agg(mean("price"))The windows will look like: 09:00:00-09:01:00 09:00:10-09:01:10 09:00:20-09:01:20 ...For a streaming query, you may use the function current_timestampto generate windows on processing time.- Parameters:
- timeColumn- The column or the expression to use as the timestamp for windowing by time. The time column must be of TimestampType or TimestampNTZType.
- windowDuration- A string specifying the width of the window, e.g.- 10 minutes,- 1 second. Check- org.apache.spark.unsafe.types.CalendarIntervalfor valid duration identifiers. Note that the duration is a fixed length of time, and does not vary over time according to a calendar. For example,- 1 dayalways means 86,400,000 milliseconds, not a calendar day.
- slideDuration- A string specifying the sliding interval of the window, e.g.- 1 minute. A new window will be generated every- slideDuration. Must be less than or equal to the- windowDuration. Check- org.apache.spark.unsafe.types.CalendarIntervalfor valid duration identifiers. This duration is likewise absolute, and does not vary according to a calendar.
- Returns:
- (undocumented)
- Since:
- 2.0.0
 
- 
windowGenerates tumbling time windows given a timestamp specifying column. Window starts are inclusive but the window ends are exclusive, e.g. 12:05 will be in the window [12:05,12:10) but not in [12:00,12:05). Windows can support microsecond precision. Windows in the order of months are not supported. The windows start beginning at 1970-01-01 00:00:00 UTC. The following example takes the average stock price for a one minute tumbling window:val df = ... // schema => timestamp: TimestampType, stockId: StringType, price: DoubleType df.groupBy(window($"timestamp", "1 minute"), $"stockId") .agg(mean("price"))The windows will look like: 09:00:00-09:01:00 09:01:00-09:02:00 09:02:00-09:03:00 ...For a streaming query, you may use the function current_timestampto generate windows on processing time.- Parameters:
- timeColumn- The column or the expression to use as the timestamp for windowing by time. The time column must be of TimestampType or TimestampNTZType.
- windowDuration- A string specifying the width of the window, e.g.- 10 minutes,- 1 second. Check- org.apache.spark.unsafe.types.CalendarIntervalfor valid duration identifiers.
- Returns:
- (undocumented)
- Since:
- 2.0.0
 
- 
window_timeExtracts the event time from the window column.The window column is of StructType { start: Timestamp, end: Timestamp } where start is inclusive and end is exclusive. Since event time can support microsecond precision, window_time(window) = window.end - 1 microsecond. - Parameters:
- windowColumn- The window column (typically produced by window aggregation) of type StructType { start: Timestamp, end: Timestamp }
- Returns:
- (undocumented)
- Since:
- 3.4.0
 
- 
session_window: Generates a session window given a timestamp specifying column. A session window is one of the dynamic windows, which means the length of the window varies according to the given inputs. The length of a session window is defined as "the timestamp of latest input of the session + gap duration", so when new inputs are bound to the current session window, the end time of the session window can be expanded according to the new inputs. Windows can support microsecond precision. gapDuration in the order of months is not supported. For a streaming query, you may use the function current_timestamp to generate windows on processing time. - Parameters:
- timeColumn- The column or the expression to use as the timestamp for windowing by time. The time column must be of TimestampType or TimestampNTZType.
- gapDuration- A string specifying the timeout of the session, e.g.- 10 minutes,- 1 second. Check- org.apache.spark.unsafe.types.CalendarIntervalfor valid duration identifiers.
- Returns:
- (undocumented)
- Since:
- 3.2.0
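A hedged sketch of a session-window aggregation with a static 5-minute gap; the events DataFrame, its eventTime and user columns, and the gap length are assumptions for illustration:

  import org.apache.spark.sql.functions.{session_window, count, lit, col}

  // events: a DataFrame (batch or streaming) with columns eventTime: Timestamp and user: String
  val sessions = events
    .groupBy(session_window(col("eventTime"), "5 minutes"), col("user"))
    .agg(count(lit(1)).as("events_in_session"))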
 
- 
session_windowGenerates session window given a timestamp specifying column.Session window is one of dynamic windows, which means the length of window is varying according to the given inputs. For static gap duration, the length of session window is defined as "the timestamp of latest input of the session + gap duration", so when the new inputs are bound to the current session window, the end time of session window can be expanded according to the new inputs. Besides a static gap duration value, users can also provide an expression to specify gap duration dynamically based on the input row. With dynamic gap duration, the closing of a session window does not depend on the latest input anymore. A session window's range is the union of all events' ranges which are determined by event start time and evaluated gap duration during the query execution. Note that the rows with negative or zero gap duration will be filtered out from the aggregation. Windows can support microsecond precision. gapDuration in the order of months are not supported. For a streaming query, you may use the function current_timestampto generate windows on processing time.- Parameters:
- timeColumn- The column or the expression to use as the timestamp for windowing by time. The time column must be of TimestampType or TimestampNTZType.
- gapDuration- A column specifying the timeout of the session. It could be static value, e.g.- 10 minutes,- 1 second, or an expression/UDF that specifies gap duration dynamically based on the input row.
- Returns:
- (undocumented)
- Since:
- 3.2.0
 
- 
timestamp_secondsConverts the number of seconds from the Unix epoch (1970-01-01T00:00:00Z) to a timestamp.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.1.0
 
- 
timestamp_millisCreates timestamp from the number of milliseconds since UTC epoch.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
timestamp_microsCreates timestamp from the number of microseconds since UTC epoch.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
timestamp_diffGets the difference between the timestamps in the specified units by truncating the fraction part.- Parameters:
- unit- (undocumented)
- start- (undocumented)
- end- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
timestamp_addAdds the specified number of units to the given timestamp.- Parameters:
- unit- (undocumented)
- quantity- (undocumented)
- ts- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
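A sketch of timestamp_add and timestamp_diff (both added in 4.0.0); the unit names, sample timestamps, and column names are assumptions:

  import org.apache.spark.sql.functions.{timestamp_add, timestamp_diff, to_timestamp, col, lit}
  import spark.implicits._  // assumes an existing SparkSession named spark

  val df = Seq(("2024-01-01 00:00:00", "2024-01-01 02:30:00")).toDF("a", "b")
  df.select(
    timestamp_add("HOUR", lit(3), to_timestamp(col("a"))),                  // 2024-01-01 03:00:00
    timestamp_diff("HOUR", to_timestamp(col("a")), to_timestamp(col("b")))  // 2: the half hour is truncated
  ).show(false)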
 
- 
time_diffReturns the difference between two times, measured in specified units. Throws a SparkIllegalArgumentException, in case the specified unit is not supported.- Parameters:
- unit- A STRING representing the unit of the time difference. Supported units are: "HOUR", "MINUTE", "SECOND", "MILLISECOND", and "MICROSECOND". The unit is case-insensitive.
- start- A starting TIME.
- end- An ending TIME.
- Returns:
- The difference between endandstarttimes, measured in specified units.
- Since:
- 4.1.0
- Note:
- If any of the inputs is NULL, the result isNULL.
 
- 
time_trunc: Returns time truncated to the unit. - Parameters:
- unit- A STRING representing the unit to truncate the time to. Supported units are: "HOUR", "MINUTE", "SECOND", "MILLISECOND", and "MICROSECOND". The unit is case-insensitive.
- time- A TIME to truncate.
- Returns:
- A TIME truncated to the specified unit.
- Throws:
- IllegalArgumentException- If the- unitis not supported.
- Since:
- 4.1.0
- Note:
- If any of the inputs is NULL, the result isNULL.
 
- 
to_timestamp_ltz: Parses the timestamp expression with the format expression to a timestamp with local time zone. Returns null with invalid input. - Parameters:
- timestamp- (undocumented)
- format- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
to_timestamp_ltz: Parses the timestamp expression with the default format to a timestamp with local time zone. The default format follows casting rules to a timestamp. Returns null with invalid input. - Parameters:
- timestamp- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
to_timestamp_ntz: Parses the timestamp expression with the format expression to a timestamp without time zone. Returns null with invalid input. - Parameters:
- timestamp- (undocumented)
- format- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
to_timestamp_ntz: Parses the timestamp expression with the default format to a timestamp without time zone. The default format follows casting rules to a timestamp. Returns null with invalid input. - Parameters:
- timestamp- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
to_unix_timestampReturns the UNIX timestamp of the given time.- Parameters:
- timeExp- (undocumented)
- format- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
to_unix_timestampReturns the UNIX timestamp of the given time.- Parameters:
- timeExp- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
monthnameExtracts the three-letter abbreviated month name from a given date/timestamp/string.- Parameters:
- timeExp- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
daynameExtracts the three-letter abbreviated day name from a given date/timestamp/string.- Parameters:
- timeExp- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
array_contains: Returns null if the array is null, true if the array contains value, and false otherwise. - Parameters:
- column- (undocumented)
- value- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
array_appendReturns an ARRAY containing all elements from the source ARRAY as well as the new element. The new element/column is located at end of the ARRAY.- Parameters:
- column- (undocumented)
- element- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.4.0
 
- 
arrays_overlap: Returns true if a1 and a2 have at least one non-null element in common. If not, and both the arrays are non-empty and any of them contains a null, it returns null. It returns false otherwise. - Parameters:
- a1- (undocumented)
- a2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
slice: Returns an array containing all the elements in x from index start (or starting from the end if start is negative) with the specified length. - Parameters:
- x- the array column to be sliced
- start- the starting index
- length- the length of the slice
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
slice: Returns an array containing all the elements in x from index start (or starting from the end if start is negative) with the specified length. - Parameters:
- x- the array column to be sliced
- start- the starting index
- length- the length of the slice
- Returns:
- (undocumented)
- Since:
- 3.1.0
 
- 
array_join: Concatenates the elements of column using the delimiter. Null values are replaced with nullReplacement. - Parameters:
- column- (undocumented)
- delimiter- (undocumented)
- nullReplacement- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
array_join: Concatenates the elements of column using the delimiter. - Parameters:
- column- (undocumented)
- delimiter- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
concatConcatenates multiple input columns together into a single column. The function works with strings, binary and compatible array columns.- Parameters:
- exprs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
- Note:
- Returns null if any of the input columns are null.
 
- 
array_positionLocates the position of the first occurrence of the value in the given array as long. Returns null if either of the arguments are null.- Parameters:
- column- (undocumented)
- value- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
- Note:
- The position is not zero based, but 1 based index. Returns 0 if value could not be found in array.
 
- 
element_atReturns element of array at given index in value if column is array. Returns value for the given key in value if column is map.- Parameters:
- column- (undocumented)
- value- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
try_element_at(array, index) - Returns element of array at given (1-based) index. If Index is 0, Spark will throw an error. If index < 0, accesses elements from the last to the first. The function always returns NULL if the index exceeds the length of the array.(map, key) - Returns value for given key. The function always returns NULL if the key is not contained in the map. - Parameters:
- column- (undocumented)
- value- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
getReturns element of array at given (0-based) index. If the index points outside of the array boundaries, then this function returns NULL.- Parameters:
- column- (undocumented)
- index- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.4.0
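A sketch contrasting the 1-based element_at, the error-free try_element_at, and the 0-based get; the array data is an assumption:

  import org.apache.spark.sql.functions.{element_at, try_element_at, get, col, lit}
  import spark.implicits._  // assumes an existing SparkSession named spark

  val df = Seq(Seq("a", "b", "c")).toDF("arr")
  df.select(
    element_at(col("arr"), 2),           // "b": 1-based index
    try_element_at(col("arr"), lit(9)),  // null: out-of-range index does not raise an error
    get(col("arr"), lit(0))              // "a": 0-based index
  ).show()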
 
- 
array_sortSorts the input array in ascending order. The elements of the input array must be orderable. NaN is greater than any non-NaN elements for double/float type. Null elements will be placed at the end of the returned array.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
array_sortSorts the input array based on the given comparator function. The comparator will take two arguments representing two elements of the array. It returns a negative integer, 0, or a positive integer as the first element is less than, equal to, or greater than the second element. If the comparator function returns null, the function will fail and raise an error.- Parameters:
- e- (undocumented)
- comparator- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.4.0
 
- 
array_remove: Removes all elements that equal element from the given array. - Parameters:
- column- (undocumented)
- element- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
array_compactRemove all null elements from the given array.- Parameters:
- column- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.4.0
 
- 
array_prependReturns an array containing value as well as all elements from array. The new element is positioned at the beginning of the array.- Parameters:
- column- (undocumented)
- element- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
array_distinctRemoves duplicate values from the array.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
array_intersectReturns an array of the elements in the intersection of the given two arrays, without duplicates.- Parameters:
- col1- (undocumented)
- col2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
array_insertAdds an item into a given array at a specified position- Parameters:
- arr- (undocumented)
- pos- (undocumented)
- value- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.4.0
 
- 
array_unionReturns an array of the elements in the union of the given two arrays, without duplicates.- Parameters:
- col1- (undocumented)
- col2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
array_exceptReturns an array of the elements in the first array but not in the second array, without duplicates. The order of elements in the result is not determined- Parameters:
- col1- (undocumented)
- col2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
transformReturns an array of elements after applying a transformation to each element in the input array.df.select(transform(col("i"), x => x + 1))- Parameters:
- column- the input array column
- f- col => transformed_col, the lambda function to transform the input column
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
transformReturns an array of elements after applying a transformation to each element in the input array.df.select(transform(col("i"), (x, i) => x + i))- Parameters:
- column- the input array column
- f- (col, index) => transformed_col, the lambda function to transform the input column given the index. Indices start at 0.
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
existsReturns whether a predicate holds for one or more elements in the array.df.select(exists(col("i"), _ % 2 === 0))- Parameters:
- column- the input array column
- f- col => predicate, the Boolean predicate to check the input column
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
forallReturns whether a predicate holds for every element in the array.df.select(forall(col("i"), x => x % 2 === 0))- Parameters:
- column- the input array column
- f- col => predicate, the Boolean predicate to check the input column
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
filterReturns an array of elements for which a predicate holds in a given array.df.select(filter(col("s"), x => x % 2 === 0))- Parameters:
- column- the input array column
- f- col => predicate, the Boolean predicate to filter the input column
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
filterReturns an array of elements for which a predicate holds in a given array.df.select(filter(col("s"), (x, i) => i % 2 === 0))- Parameters:
- column- the input array column
- f- (col, index) => predicate, the Boolean predicate to filter the input column given the index. Indices start at 0.
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
aggregatepublic static Column aggregate(Column expr, Column initialValue, scala.Function2<Column, Column, Column> merge, scala.Function1<Column, Column> finish) Applies a binary operator to an initial state and all elements in the array, and reduces this to a single state. The final state is converted into the final result by applying a finish function.df.select(aggregate(col("i"), lit(0), (acc, x) => acc + x, _ * 10))- Parameters:
- expr- the input array column
- initialValue- the initial value
- merge- (combined_value, input_value) => combined_value, the merge function to merge an input value to the combined_value
- finish- combined_value => final_value, the lambda function to convert the combined value of all inputs to final result
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
aggregatepublic static Column aggregate(Column expr, Column initialValue, scala.Function2<Column, Column, Column> merge) Applies a binary operator to an initial state and all elements in the array, and reduces this to a single state.df.select(aggregate(col("i"), lit(0), (acc, x) => acc + x))- Parameters:
- expr- the input array column
- initialValue- the initial value
- merge- (combined_value, input_value) => combined_value, the merge function to merge an input value to the combined_value
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
reduce: public static Column reduce(Column expr, Column initialValue, scala.Function2<Column, Column, Column> merge, scala.Function1<Column, Column> finish) Applies a binary operator to an initial state and all elements in the array, and reduces this to a single state. The final state is converted into the final result by applying a finish function. df.select(reduce(col("i"), lit(0), (acc, x) => acc + x, _ * 10)) - Parameters:
- expr- the input array column
- initialValue- the initial value
- merge- (combined_value, input_value) => combined_value, the merge function to merge an input value to the combined_value
- finish- combined_value => final_value, the lambda function to convert the combined value of all inputs to final result
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
reduce: public static Column reduce(Column expr, Column initialValue, scala.Function2<Column, Column, Column> merge) Applies a binary operator to an initial state and all elements in the array, and reduces this to a single state. df.select(reduce(col("i"), lit(0), (acc, x) => acc + x)) - Parameters:
- expr- the input array column
- initialValue- the initial value
- merge- (combined_value, input_value) => combined_value, the merge function to merge an input value to the combined_value
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
zip_withMerge two given arrays, element-wise, into a single array using a function. If one array is shorter, nulls are appended at the end to match the length of the longer array, before applying the function.df.select(zip_with(df1("val1"), df1("val2"), (x, y) => x + y))- Parameters:
- left- the left input array column
- right- the right input array column
- f- (lCol, rCol) => col, the lambda function to merge two input columns into one column
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
transform_keysApplies a function to every key-value pair in a map and returns a map with the results of those applications as the new keys for the pairs.df.select(transform_keys(col("i"), (k, v) => k + v))- Parameters:
- expr- the input map column
- f- (key, value) => new_key, the lambda function to transform the key of input map column
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
transform_valuesApplies a function to every key-value pair in a map and returns a map with the results of those applications as the new values for the pairs.df.select(transform_values(col("i"), (k, v) => k + v))- Parameters:
- expr- the input map column
- f- (key, value) => new_value, the lambda function to transform the value of input map column
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
map_filterReturns a map whose key-value pairs satisfy a predicate.df.select(map_filter(col("m"), (k, v) => k * 10 === v))- Parameters:
- expr- the input map column
- f- (key, value) => predicate, the Boolean predicate to filter the input map column
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
map_zip_withpublic static Column map_zip_with(Column left, Column right, scala.Function3<Column, Column, Column, Column> f) Merge two given maps, key-wise into a single map using a function.df.select(map_zip_with(df("m1"), df("m2"), (k, v1, v2) => k === v1 + v2))- Parameters:
- left- the left input map column
- right- the right input map column
- f- (key, value1, value2) => new_value, the lambda function to merge the map values
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
explode: Creates a new row for each element in the given array or map column. Uses the default column name col for elements in the array and key and value for elements in the map unless specified otherwise. - Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
explode_outer: Creates a new row for each element in the given array or map column. Uses the default column name col for elements in the array and key and value for elements in the map unless specified otherwise. Unlike explode, if the array/map is null or empty then null is produced. - Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.2.0
 
- 
posexplode: Creates a new row for each element with position in the given array or map column. Uses the default column name pos for position, and col for elements in the array and key and value for elements in the map unless specified otherwise. - Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
posexplode_outer: Creates a new row for each element with position in the given array or map column. Uses the default column name pos for position, and col for elements in the array and key and value for elements in the map unless specified otherwise. Unlike posexplode, if the array/map is null or empty then the row (null, null) is produced. - Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.2.0
 
- 
inlineCreates a new row for each element in the given array of structs.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.4.0
 
- 
inline_outerCreates a new row for each element in the given array of structs. Unlike inline, if the array is null or empty then null is produced for each nested column.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.4.0
 
- 
get_json_objectExtracts json object from a json string based on json path specified, and returns json string of the extracted json object. It will return null if the input json string is invalid.- Parameters:
- e- (undocumented)
- path- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
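A minimal sketch of get_json_object with JSONPath-style expressions; the JSON payload and paths are assumptions:

  import org.apache.spark.sql.functions.{get_json_object, col}
  import spark.implicits._  // assumes an existing SparkSession named spark

  val df = Seq("""{"name": "spark", "tags": ["fast", "sql"]}""").toDF("js")
  df.select(
    get_json_object(col("js"), "$.name"),     // "spark"
    get_json_object(col("js"), "$.tags[0]")   // "fast"
  ).show(false)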
 
- 
json_tupleCreates a new row for a json column according to the given field names.- Parameters:
- json- (undocumented)
- fields- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.6.0
 
- 
from_jsonpublic static Column from_json(Column e, StructType schema, scala.collection.immutable.Map<String, String> options) (Scala-specific) Parses a column containing a JSON string into aStructTypewith the specified schema. Returnsnull, in the case of an unparseable string.- Parameters:
- e- a string column containing JSON data.
- schema- the schema to use when parsing the json string
- options- options to control how the json is parsed. Accepts the same options as the json data source. See Data Source Option in the version you use.
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
from_jsonpublic static Column from_json(Column e, DataType schema, scala.collection.immutable.Map<String, String> options) (Scala-specific) Parses a column containing a JSON string into aMapTypewithStringTypeas keys type,StructTypeorArrayTypewith the specified schema. Returnsnull, in the case of an unparseable string.- Parameters:
- e- a string column containing JSON data.
- schema- the schema to use when parsing the json string
- options - options to control how the json is parsed. Accepts the same options as the json data source. See Data Source Option in the version you use.
- Returns:
- (undocumented)
- Since:
- 2.2.0
 
- 
from_json(Java-specific) Parses a column containing a JSON string into aStructTypewith the specified schema. Returnsnull, in the case of an unparseable string.- Parameters:
- e- a string column containing JSON data.
- schema- the schema to use when parsing the json string
- options- options to control how the json is parsed. Accepts the same options as the json data source. See Data Source Option in the version you use.
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
from_json
(Java-specific) Parses a column containing a JSON string into a MapType with StringType as keys type, StructType or ArrayType with the specified schema. Returns null, in the case of an unparseable string.
- Parameters:
- e- a string column containing JSON data.
- schema- the schema to use when parsing the json string
- options- options to control how the json is parsed. Accepts the same options as the json data source. See Data Source Option in the version you use.
- Returns:
- (undocumented)
- Since:
- 2.2.0
 
- 
from_json
Parses a column containing a JSON string into a StructType with the specified schema. Returns null, in the case of an unparseable string.
- Parameters:
- e- a string column containing JSON data.
- schema- the schema to use when parsing the json string
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
from_json
Parses a column containing a JSON string into a MapType with StringType as keys type, StructType or ArrayType with the specified schema. Returns null, in the case of an unparseable string.
- Parameters:
- e- a string column containing JSON data.
- schema- the schema to use when parsing the json string
- Returns:
- (undocumented)
- Since:
- 2.2.0
 
- 
from_json
(Java-specific) Parses a column containing a JSON string into a MapType with StringType as keys type, StructType or ArrayType with the specified schema. Returns null, in the case of an unparseable string.
- Parameters:
- e- a string column containing JSON data.
- schema- the schema as a DDL-formatted string.
- options- options to control how the json is parsed. Accepts the same options as the json data source. See Data Source Option in the version you use.
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
from_json
public static Column from_json(Column e, String schema, scala.collection.immutable.Map<String, String> options)
(Scala-specific) Parses a column containing a JSON string into a MapType with StringType as keys type, StructType or ArrayType with the specified schema. Returns null, in the case of an unparseable string.
- Parameters:
- e- a string column containing JSON data.
- schema- the schema as a DDL-formatted string.
- options- options to control how the json is parsed. Accepts the same options as the json data source. See Data Source Option in the version you use.
- Returns:
- (undocumented)
- Since:
- 2.3.0
 
- 
from_json
(Scala-specific) Parses a column containing a JSON string into a MapType with StringType as keys type, StructType or ArrayType of StructTypes with the specified schema. Returns null, in the case of an unparseable string.
- Parameters:
- e- a string column containing JSON data.
- schema- the schema to use when parsing the json string
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
from_json
(Java-specific) Parses a column containing a JSON string into a MapType with StringType as keys type, StructType or ArrayType of StructTypes with the specified schema. Returns null, in the case of an unparseable string.
- Parameters:
- e- a string column containing JSON data.
- schema- the schema to use when parsing the json string
- options- options to control how the json is parsed. Accepts the same options as the json data source. See Data Source Option in the version you use.
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
try_parse_jsonParses a JSON string and constructs a Variant value. Returns null if the input string is not a valid JSON value.- Parameters:
- json- a string column that contains JSON data.
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
parse_jsonParses a JSON string and constructs a Variant value.- Parameters:
- json- a string column that contains JSON data.
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
to_variant_object
Converts a column containing nested inputs (array/map/struct) into a variant, where maps and structs are converted to variant objects, which are unordered unlike SQL structs. Input maps can only have string keys.
- Parameters:
- col- a column with a nested schema or column name.
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
is_variant_nullCheck if a variant value is a variant null. Returns true if and only if the input is a variant null and false otherwise (including in the case of SQL NULL).- Parameters:
- v- a variant column.
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
variant_get
Extracts a sub-variant from v according to the path string, and then casts the sub-variant to targetType. Returns null if the path does not exist. Throws an exception if the cast fails.
- Parameters:
- v- a variant column.
- path- the extraction path. A valid path should start with- $and is followed by zero or more segments like- [123],- .name,- ['name'], or- ["name"].
- targetType- the target data type to cast into, in a DDL-formatted string.
- Returns:
- (undocumented)
- Since:
- 4.0.0
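
A sketch combining parse_json and variant_get (assuming spark.implicits._ and functions._; the column and path are made up):

val orders = Seq("""{"item": "book", "price": 12.5}""").toDF("json")
// parse to a VARIANT value, then extract one field and cast it via a DDL type string
orders.select(variant_get(parse_json($"json"), "$.price", "double").as("price"))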
 
- 
variant_get
Extracts a sub-variant from v according to the path column, and then casts the sub-variant to targetType. Returns null if the path does not exist. Throws an exception if the cast fails.
- Parameters:
- v- a variant column.
- path- the column containing the extraction path strings. A valid path string should start with- $and is followed by zero or more segments like- [123],- .name,- ['name'], or- ["name"].
- targetType- the target data type to cast into, in a DDL-formatted string.
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
try_variant_get
Extracts a sub-variant from v according to the path string, and then casts the sub-variant to targetType. Returns null if the path does not exist or the cast fails.
- Parameters:
- v- a variant column.
- path- the extraction path. A valid path should start with- $and is followed by zero or more segments like- [123],- .name,- ['name'], or- ["name"].
- targetType- the target data type to cast into, in a DDL-formatted string.
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
try_variant_get
Extracts a sub-variant from v according to the path column, and then casts the sub-variant to targetType. Returns null if the path does not exist or the cast fails.
- Parameters:
- v- a variant column.
- path- the column containing the extraction path strings. A valid path string should start with- $and is followed by zero or more segments like- [123],- .name,- ['name'], or- ["name"].
- targetType- the target data type to cast into, in a DDL-formatted string.
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
schema_of_variantReturns schema in the SQL format of a variant.- Parameters:
- v- a variant column.
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
schema_of_variant_aggReturns the merged schema in the SQL format of a variant column.- Parameters:
- v- a variant column.
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
schema_of_jsonParses a JSON string and infers its schema in DDL format.- Parameters:
- json- a JSON string.
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
schema_of_jsonParses a JSON string and infers its schema in DDL format.- Parameters:
- json- a foldable string column containing a JSON string.
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
schema_of_jsonParses a JSON string and infers its schema in DDL format using options.- Parameters:
- json- a foldable string column containing JSON data.
- options- options to control how the json is parsed. Accepts the same options as the json data source. See Data Source Option in the version you use.
- Returns:
- a column with string literal containing schema in DDL format.
- Since:
- 3.0.0
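
For example, a small sketch of schema inference from a literal JSON document (assuming functions._ is in scope):

// returns a string literal such as STRUCT<a: BIGINT, b: ARRAY<DOUBLE>>
spark.range(1).select(schema_of_json(lit("""{"a": 1, "b": [1.5]}""")))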
 
- 
json_array_length
Returns the number of elements in the outermost JSON array. NULL is returned in case of any other valid JSON string, NULL or an invalid JSON.
- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
json_object_keysReturns all the keys of the outermost JSON object as an array. If a valid JSON object is given, all the keys of the outermost object will be returned as an array. If it is any other valid JSON string, an invalid JSON string or an empty string, the function returns null.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
to_json
(Scala-specific) Converts a column containing a StructType, ArrayType or a MapType into a JSON string with the specified schema. Throws an exception, in the case of an unsupported type.
- Parameters:
- e- a column containing a struct, an array or a map.
- options- options to control how the struct column is converted into a json string. Accepts the same options as the json data source. See Data Source Option in the version you use. Additionally the function supports the pretty option which enables pretty JSON generation.
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
to_json
(Java-specific) Converts a column containing a StructType, ArrayType or a MapType into a JSON string with the specified schema. Throws an exception, in the case of an unsupported type.
- Parameters:
- e- a column containing a struct, an array or a map.
- options- options to control how the struct column is converted into a json string. Accepts the same options as the json data source. See Data Source Option in the version you use. Additionally the function supports the pretty option which enables pretty JSON generation.
- Returns:
- (undocumented)
- Since:
- 2.1.0
 
- 
to_json
Converts a column containing a StructType, ArrayType or a MapType into a JSON string with the specified schema. Throws an exception, in the case of an unsupported type.
- Parameters:
- e- a column containing a struct, an array or a map.
- Returns:
- (undocumented)
- Since:
- 2.1.0
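
A sketch serializing a struct of existing columns to a JSON string (made-up columns; assumes spark.implicits._ and functions._):

val people = Seq((1, "Alice")).toDF("id", "name")
// produces {"id":1,"name":"Alice"}
people.select(to_json(struct($"id", $"name")).as("json"))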
 
- 
mask
Masks the given string value. The function replaces upper-case characters with 'X', lower-case characters with 'x', and digits with 'n'. This can be useful for creating copies of tables with sensitive information removed.
- Parameters:
- input- string value to mask. Supported types: STRING, VARCHAR, CHAR
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
mask
Masks the given string value. The function replaces upper-case characters with the specified character, lower-case characters with 'x', and numbers with 'n'. This can be useful for creating copies of tables with sensitive information removed.
- Parameters:
- input- string value to mask. Supported types: STRING, VARCHAR, CHAR
- upperChar- character to replace upper-case characters with. Specify NULL to retain original character.
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
maskMasks the given string value. The function replaces upper-case and lower-case characters with the characters specified respectively, and numbers with 'n'. This can be useful for creating copies of tables with sensitive information removed.- Parameters:
- input- string value to mask. Supported types: STRING, VARCHAR, CHAR
- upperChar- character to replace upper-case characters with. Specify NULL to retain original character.
- lowerChar- character to replace lower-case characters with. Specify NULL to retain original character.
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
maskMasks the given string value. The function replaces upper-case, lower-case characters and numbers with the characters specified respectively. This can be useful for creating copies of tables with sensitive information removed.- Parameters:
- input- string value to mask. Supported types: STRING, VARCHAR, CHAR
- upperChar- character to replace upper-case characters with. Specify NULL to retain original character.
- lowerChar- character to replace lower-case characters with. Specify NULL to retain original character.
- digitChar- character to replace digit characters with. Specify NULL to retain original character.
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
maskpublic static Column mask(Column input, Column upperChar, Column lowerChar, Column digitChar, Column otherChar) Masks the given string value. This can be useful for creating copies of tables with sensitive information removed.- Parameters:
- input- string value to mask. Supported types: STRING, VARCHAR, CHAR
- upperChar- character to replace upper-case characters with. Specify NULL to retain original character.
- lowerChar- character to replace lower-case characters with. Specify NULL to retain original character.
- digitChar- character to replace digit characters with. Specify NULL to retain original character.
- otherChar- character to replace all other characters with. Specify NULL to retain original character.
- Returns:
- (undocumented)
- Since:
- 3.5.0
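
A short sketch of two of the masking overloads (made-up column; replacement characters chosen arbitrarily; assumes spark.implicits._ and functions._):

val cards = Seq("Ab-12").toDF("card")
cards.select(
  mask($"card"),                                         // defaults: "Xx-nn"
  mask($"card", lit("U"), lit("l"), lit("#"), lit("*"))) // all classes replaced: "Ul*##"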
 
- 
size
Returns length of array or map.
This function returns -1 for null input only if spark.sql.ansi.enabled is false and spark.sql.legacy.sizeOfNull is true. Otherwise, it returns null for null input. With the default settings, the function returns null for null input.
- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
cardinality
Returns length of array or map. This is an alias of the size function.
This function returns -1 for null input only if spark.sql.ansi.enabled is false and spark.sql.legacy.sizeOfNull is true. Otherwise, it returns null for null input. With the default settings, the function returns null for null input.
- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
sort_arraySorts the input array for the given column in ascending order, according to the natural ordering of the array elements. Null elements will be placed at the beginning of the returned array.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
sort_arraySorts the input array for the given column in ascending or descending order, according to the natural ordering of the array elements. NaN is greater than any non-NaN elements for double/float type. Null elements will be placed at the beginning of the returned array in ascending order or at the end of the returned array in descending order.- Parameters:
- e- (undocumented)
- asc- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
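
For example (made-up array column; assumes spark.implicits._ and functions._):

val nums = Seq(Seq(Some(2), None, Some(1))).toDF("xs")
nums.select(
  sort_array($"xs"),              // ascending: [null, 1, 2]
  sort_array($"xs", asc = false)) // descending: [2, 1, null]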
 
- 
array_minReturns the minimum value in the array. NaN is greater than any non-NaN elements for double/float type. NULL elements are skipped.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
array_maxReturns the maximum value in the array. NaN is greater than any non-NaN elements for double/float type. NULL elements are skipped.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
array_sizeReturns the total number of elements in the array. The function returns null for null input.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
array_aggAggregate function: returns a list of objects with duplicates.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
- Note:
- The function is non-deterministic because the order of collected results depends on the order of the rows which may be non-deterministic after a shuffle.
 
- 
shuffleReturns a random permutation of the given array.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
- Note:
- The function is non-deterministic.
 
- 
shuffleReturns a random permutation of the given array.- Parameters:
- e- (undocumented)
- seed- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
- Note:
- The function is non-deterministic.
 
- 
reverseReturns a reversed string or an array with reverse order of elements.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
flattenCreates a single array from an array of arrays. If a structure of nested arrays is deeper than two levels, only one level of nesting is removed.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
sequenceGenerate a sequence of integers from start to stop, incrementing by step.- Parameters:
- start- (undocumented)
- stop- (undocumented)
- step- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
sequenceGenerate a sequence of integers from start to stop, incrementing by 1 if start is less than or equal to stop, otherwise -1.- Parameters:
- start- (undocumented)
- stop- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
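
A sketch of both sequence overloads (assumes functions._):

spark.range(1).select(
  sequence(lit(1), lit(5)),          // [1, 2, 3, 4, 5]
  sequence(lit(5), lit(1), lit(-2))) // [5, 3, 1]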
 
- 
array_repeatCreates an array containing the left argument repeated the number of times given by the right argument.- Parameters:
- left- (undocumented)
- right- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
array_repeatCreates an array containing the left argument repeated the number of times given by the right argument.- Parameters:
- e- (undocumented)
- count- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
map_contains_keyReturns true if the map contains the key.- Parameters:
- column- (undocumented)
- key- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.3.0
 
- 
map_keysReturns an unordered array containing the keys of the map.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.3.0
 
- 
map_valuesReturns an unordered array containing the values of the map.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.3.0
 
- 
map_entriesReturns an unordered array of all entries in the given map.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
map_from_entriesReturns a map created from the given array of entries.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
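
A sketch (made-up data; assumes spark.implicits._ and functions._): an array of (key, value) structs becomes a map:

val entries = Seq(Seq((1, "a"), (2, "b"))).toDF("entries")
// -> Map(1 -> "a", 2 -> "b")
entries.select(map_from_entries($"entries"))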
 
- 
arrays_zipReturns a merged array of structs in which the N-th struct contains all N-th values of input arrays.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
map_concatReturns the union of all the given maps.- Parameters:
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.4.0
 
- 
from_csv
public static Column from_csv(Column e, StructType schema, scala.collection.immutable.Map<String, String> options)
Parses a column containing a CSV string into a StructType with the specified schema. Returns null, in the case of an unparseable string.
- Parameters:
- e- a string column containing CSV data.
- schema- the schema to use when parsing the CSV string
- options- options to control how the CSV is parsed. Accepts the same options as the CSV data source. See Data Source Option in the version you use.
- Returns:
- (undocumented)
- Since:
- 3.0.0
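
A sketch for this overload (made-up schema and row; assumes spark.implicits._ and functions._):

import org.apache.spark.sql.types._

val csvSchema = new StructType().add("a", IntegerType).add("b", StringType)
val lines = Seq("1,ab").toDF("value")
lines.select(from_csv($"value", csvSchema, Map.empty[String, String]).as("parsed"))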
 
- 
from_csv
(Java-specific) Parses a column containing a CSV string into a StructType with the specified schema. Returns null, in the case of an unparseable string.
- Parameters:
- e- a string column containing CSV data.
- schema- the schema to use when parsing the CSV string
- options- options to control how the CSV is parsed. Accepts the same options as the CSV data source. See Data Source Option in the version you use.
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
schema_of_csvParses a CSV string and infers its schema in DDL format.- Parameters:
- csv- a CSV string.
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
schema_of_csvParses a CSV string and infers its schema in DDL format.- Parameters:
- csv- a foldable string column containing a CSV string.
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
schema_of_csvParses a CSV string and infers its schema in DDL format using options.- Parameters:
- csv- a foldable string column containing a CSV string.
- options- options to control how the CSV is parsed. Accepts the same options as the CSV data source. See Data Source Option in the version you use.
- Returns:
- a column with string literal containing schema in DDL format.
- Since:
- 3.0.0
 
- 
to_csv
(Java-specific) Converts a column containing a StructType into a CSV string with the specified schema. Throws an exception, in the case of an unsupported type.
- Parameters:
- e- a column containing a struct.
- options- options to control how the struct column is converted into a CSV string. It accepts the same options as the CSV data source. See Data Source Option in the version you use.
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
to_csv
Converts a column containing a StructType into a CSV string with the specified schema. Throws an exception, in the case of an unsupported type.
- Parameters:
- e- a column containing a struct.
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
from_xml
Parses a column containing an XML string into the data type corresponding to the specified schema. Returns null, in the case of an unparseable string.
- Parameters:
- e- a string column containing XML data.
- schema- the schema to use when parsing the XML string
- options- options to control how the XML is parsed. Accepts the same options as the XML data source. See Data Source Option in the version you use.
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
from_xml
(Java-specific) Parses a column containing an XML string into a StructType with the specified schema. Returns null, in the case of an unparseable string.
- Parameters:
- e- a string column containing XML data.
- schema- the schema as a DDL-formatted string.
- options- options to control how the XML is parsed. Accepts the same options as the XML data source. See Data Source Option in the version you use.
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
from_xml
(Java-specific) Parses a column containing an XML string into a StructType with the specified schema. Returns null, in the case of an unparseable string.
- Parameters:
- e- a string column containing XML data.
- schema- the schema to use when parsing the XML string
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
from_xml
(Java-specific) Parses a column containing an XML string into a StructType with the specified schema. Returns null, in the case of an unparseable string.
- Parameters:
- e- a string column containing XML data.
- schema- the schema to use when parsing the XML string
- options- options to control how the XML is parsed. Accepts the same options as the XML data source. See Data Source Option in the version you use.
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
from_xml
Parses a column containing an XML string into the data type corresponding to the specified schema. Returns null, in the case of an unparseable string.
- Parameters:
- e- a string column containing XML data.
- schema- the schema to use when parsing the XML string
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
schema_of_xmlParses a XML string and infers its schema in DDL format.- Parameters:
- xml- a XML string.
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
schema_of_xmlParses a XML string and infers its schema in DDL format.- Parameters:
- xml- a foldable string column containing a XML string.
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
schema_of_xmlParses a XML string and infers its schema in DDL format using options.- Parameters:
- xml- a foldable string column containing XML data.
- options- options to control how the XML is parsed. Accepts the same options as the XML data source. See Data Source Option in the version you use.
- Returns:
- a column with string literal containing schema in DDL format.
- Since:
- 4.0.0
 
- 
to_xml
(Java-specific) Converts a column containing a StructType into an XML string with the specified schema. Throws an exception, in the case of an unsupported type.
- Parameters:
- e- a column containing a struct.
- options- options to control how the struct column is converted into a XML string. It accepts the same options as the XML data source. See Data Source Option in the version you use.
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
to_xml
Converts a column containing a StructType into an XML string with the specified schema. Throws an exception, in the case of an unsupported type.
- Parameters:
- e- a column containing a struct.
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
years(Java-specific) A transform for timestamps and dates to partition data into years.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
months(Java-specific) A transform for timestamps and dates to partition data into months.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
days(Java-specific) A transform for timestamps and dates to partition data into days.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
xpathReturns a string array of values within the nodes of xml that match the XPath expression.- Parameters:
- xml- (undocumented)
- path- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
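
A brief sketch with a literal XML value (assumes functions._):

// returns the array ["1", "2"]
spark.range(1).select(xpath(lit("<a><b>1</b><b>2</b></a>"), lit("a/b/text()")))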
 
- 
xpath_booleanReturns true if the XPath expression evaluates to true, or if a matching node is found.- Parameters:
- xml- (undocumented)
- path- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
xpath_doubleReturns a double value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.- Parameters:
- xml- (undocumented)
- path- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
xpath_numberReturns a double value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.- Parameters:
- xml- (undocumented)
- path- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
xpath_floatReturns a float value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.- Parameters:
- xml- (undocumented)
- path- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
xpath_intReturns an integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.- Parameters:
- xml- (undocumented)
- path- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
xpath_longReturns a long integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.- Parameters:
- xml- (undocumented)
- path- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
xpath_shortReturns a short integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.- Parameters:
- xml- (undocumented)
- path- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
xpath_stringReturns the text contents of the first xml node that matches the XPath expression.- Parameters:
- xml- (undocumented)
- path- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
hours(Java-specific) A transform for timestamps to partition data into hours.- Parameters:
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
convert_timezone
Converts the timestamp without time zone sourceTs from the sourceTz time zone to targetTz.
- Parameters:
- sourceTz- the time zone for the input timestamp. If it is omitted, the current session time zone is used as the source time zone.
- targetTz- the time zone to which the input timestamp should be converted.
- sourceTs- a timestamp without time zone.
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
convert_timezone
Converts the timestamp without time zone sourceTs from the current time zone to targetTz.
- Parameters:
- targetTz- the time zone to which the input timestamp should be converted.
- sourceTs- a timestamp without time zone.
- Returns:
- (undocumented)
- Since:
- 3.5.0
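
A sketch of both overloads (assumes spark.implicits._ and functions._; to_timestamp_ntz is used only to build a made-up TIMESTAMP_NTZ column):

val events = Seq("2024-01-01 12:00:00").toDF("s")
  .select(to_timestamp_ntz($"s").as("ts"))
events.select(
  convert_timezone(lit("America/Los_Angeles"), lit("UTC"), $"ts"), // explicit source zone
  convert_timezone(lit("UTC"), $"ts"))                             // source = session time zone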
 
- 
make_dt_intervalMake DayTimeIntervalType duration from days, hours, mins and secs.- Parameters:
- days- (undocumented)
- hours- (undocumented)
- mins- (undocumented)
- secs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
make_dt_intervalMake DayTimeIntervalType duration from days, hours and mins.- Parameters:
- days- (undocumented)
- hours- (undocumented)
- mins- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
make_dt_intervalMake DayTimeIntervalType duration from days and hours.- Parameters:
- days- (undocumented)
- hours- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
make_dt_intervalMake DayTimeIntervalType duration from days.- Parameters:
- days- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
make_dt_intervalMake DayTimeIntervalType duration.- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
try_make_interval
public static Column try_make_interval(Column years, Column months, Column weeks, Column days, Column hours, Column mins, Column secs)
This is a special version of make_interval that performs the same operation, but returns a NULL value instead of raising an error if interval cannot be created.
- Parameters:
- years- (undocumented)
- months- (undocumented)
- weeks- (undocumented)
- days- (undocumented)
- hours- (undocumented)
- mins- (undocumented)
- secs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
make_intervalpublic static Column make_interval(Column years, Column months, Column weeks, Column days, Column hours, Column mins, Column secs) Make interval from years, months, weeks, days, hours, mins and secs.- Parameters:
- years- (undocumented)
- months- (undocumented)
- weeks- (undocumented)
- days- (undocumented)
- hours- (undocumented)
- mins- (undocumented)
- secs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
try_make_interval
public static Column try_make_interval(Column years, Column months, Column weeks, Column days, Column hours, Column mins)
This is a special version of make_interval that performs the same operation, but returns a NULL value instead of raising an error if interval cannot be created.
- Parameters:
- years- (undocumented)
- months- (undocumented)
- weeks- (undocumented)
- days- (undocumented)
- hours- (undocumented)
- mins- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
make_intervalpublic static Column make_interval(Column years, Column months, Column weeks, Column days, Column hours, Column mins) Make interval from years, months, weeks, days, hours and mins.- Parameters:
- years- (undocumented)
- months- (undocumented)
- weeks- (undocumented)
- days- (undocumented)
- hours- (undocumented)
- mins- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
try_make_interval
public static Column try_make_interval(Column years, Column months, Column weeks, Column days, Column hours)
This is a special version of make_interval that performs the same operation, but returns a NULL value instead of raising an error if interval cannot be created.
- Parameters:
- years- (undocumented)
- months- (undocumented)
- weeks- (undocumented)
- days- (undocumented)
- hours- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
make_intervalpublic static Column make_interval(Column years, Column months, Column weeks, Column days, Column hours) Make interval from years, months, weeks, days and hours.- Parameters:
- years- (undocumented)
- months- (undocumented)
- weeks- (undocumented)
- days- (undocumented)
- hours- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
try_make_interval
This is a special version of make_interval that performs the same operation, but returns a NULL value instead of raising an error if interval cannot be created.
- Parameters:
- years- (undocumented)
- months- (undocumented)
- weeks- (undocumented)
- days- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
make_intervalMake interval from years, months, weeks and days.- Parameters:
- years- (undocumented)
- months- (undocumented)
- weeks- (undocumented)
- days- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
try_make_interval
This is a special version of make_interval that performs the same operation, but returns a NULL value instead of raising an error if interval cannot be created.
- Parameters:
- years- (undocumented)
- months- (undocumented)
- weeks- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
make_intervalMake interval from years, months and weeks.- Parameters:
- years- (undocumented)
- months- (undocumented)
- weeks- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
try_make_interval
This is a special version of make_interval that performs the same operation, but returns a NULL value instead of raising an error if interval cannot be created.
- Parameters:
- years- (undocumented)
- months- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
make_intervalMake interval from years and months.- Parameters:
- years- (undocumented)
- months- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
try_make_interval
This is a special version of make_interval that performs the same operation, but returns a NULL value instead of raising an error if interval cannot be created.
- Parameters:
- years- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
make_intervalMake interval from years.- Parameters:
- years- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
make_intervalMake interval.- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
make_timestamp
public static Column make_timestamp(Column years, Column months, Column days, Column hours, Column mins, Column secs, Column timezone)
Create timestamp from years, months, days, hours, mins, secs and timezone fields. The result data type is consistent with the value of configuration spark.sql.timestampType. If the configuration spark.sql.ansi.enabled is false, the function returns NULL on invalid inputs. Otherwise, it will throw an error instead.
- Parameters:
- years- (undocumented)
- months- (undocumented)
- days- (undocumented)
- hours- (undocumented)
- mins- (undocumented)
- secs- (undocumented)
- timezone- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
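
A sketch of the seven-argument overload with literal field values (assumes functions._):

// 2024-02-29 12:30:45.887 interpreted in UTC
spark.range(1).select(
  make_timestamp(lit(2024), lit(2), lit(29), lit(12), lit(30), lit(45.887), lit("UTC")))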
 
- 
make_timestamp
public static Column make_timestamp(Column years, Column months, Column days, Column hours, Column mins, Column secs)
Create timestamp from years, months, days, hours, mins and secs fields. The result data type is consistent with the value of configuration spark.sql.timestampType. If the configuration spark.sql.ansi.enabled is false, the function returns NULL on invalid inputs. Otherwise, it will throw an error instead.
- Parameters:
- years- (undocumented)
- months- (undocumented)
- days- (undocumented)
- hours- (undocumented)
- mins- (undocumented)
- secs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
make_timestampCreate a local date-time from date, time, and timezone fields.- Parameters:
- date- (undocumented)
- time- (undocumented)
- timezone- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
make_timestampCreate a local date-time from date and time fields.- Parameters:
- date- (undocumented)
- time- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
try_make_timestamp
public static Column try_make_timestamp(Column years, Column months, Column days, Column hours, Column mins, Column secs, Column timezone)
Try to create a timestamp from years, months, days, hours, mins, secs and timezone fields. The result data type is consistent with the value of configuration spark.sql.timestampType. The function returns NULL on invalid inputs.
- Parameters:
- years- (undocumented)
- months- (undocumented)
- days- (undocumented)
- hours- (undocumented)
- mins- (undocumented)
- secs- (undocumented)
- timezone- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
try_make_timestamp
public static Column try_make_timestamp(Column years, Column months, Column days, Column hours, Column mins, Column secs)
Try to create a timestamp from years, months, days, hours, mins, and secs fields. The result data type is consistent with the value of configuration spark.sql.timestampType. The function returns NULL on invalid inputs.
- Parameters:
- years- (undocumented)
- months- (undocumented)
- days- (undocumented)
- hours- (undocumented)
- mins- (undocumented)
- secs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
try_make_timestampTry to create a local date-time from date, time, and timezone fields.- Parameters:
- date- (undocumented)
- time- (undocumented)
- timezone- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
try_make_timestampTry to create a local date-time from date and time fields.- Parameters:
- date- (undocumented)
- time- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
make_timestamp_ltz
public static Column make_timestamp_ltz(Column years, Column months, Column days, Column hours, Column mins, Column secs, Column timezone)
Create the current timestamp with local time zone from years, months, days, hours, mins, secs and timezone fields. If the configuration spark.sql.ansi.enabled is false, the function returns NULL on invalid inputs. Otherwise, it will throw an error instead.
- Parameters:
- years- (undocumented)
- months- (undocumented)
- days- (undocumented)
- hours- (undocumented)
- mins- (undocumented)
- secs- (undocumented)
- timezone- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
make_timestamp_ltz
public static Column make_timestamp_ltz(Column years, Column months, Column days, Column hours, Column mins, Column secs)
Create the current timestamp with local time zone from years, months, days, hours, mins and secs fields. If the configuration spark.sql.ansi.enabled is false, the function returns NULL on invalid inputs. Otherwise, it will throw an error instead.
- Parameters:
- years- (undocumented)
- months- (undocumented)
- days- (undocumented)
- hours- (undocumented)
- mins- (undocumented)
- secs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
try_make_timestamp_ltzpublic static Column try_make_timestamp_ltz(Column years, Column months, Column days, Column hours, Column mins, Column secs, Column timezone) Try to create the current timestamp with local time zone from years, months, days, hours, mins, secs and timezone fields. The function returns NULL on invalid inputs.- Parameters:
- years- (undocumented)
- months- (undocumented)
- days- (undocumented)
- hours- (undocumented)
- mins- (undocumented)
- secs- (undocumented)
- timezone- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
try_make_timestamp_ltzpublic static Column try_make_timestamp_ltz(Column years, Column months, Column days, Column hours, Column mins, Column secs) Try to create the current timestamp with local time zone from years, months, days, hours, mins and secs fields. The function returns NULL on invalid inputs.- Parameters:
- years- (undocumented)
- months- (undocumented)
- days- (undocumented)
- hours- (undocumented)
- mins- (undocumented)
- secs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
make_timestamp_ntz
public static Column make_timestamp_ntz(Column years, Column months, Column days, Column hours, Column mins, Column secs)
Create local date-time from years, months, days, hours, mins, secs fields. If the configuration spark.sql.ansi.enabled is false, the function returns NULL on invalid inputs. Otherwise, it will throw an error instead.
- Parameters:
- years- (undocumented)
- months- (undocumented)
- days- (undocumented)
- hours- (undocumented)
- mins- (undocumented)
- secs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
make_timestamp_ntzCreate a local date-time from date and time fields.- Parameters:
- date- (undocumented)
- time- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
try_make_timestamp_ntzpublic static Column try_make_timestamp_ntz(Column years, Column months, Column days, Column hours, Column mins, Column secs) Try to create a local date-time from years, months, days, hours, mins, secs fields. The function returns NULL on invalid inputs.- Parameters:
- years- (undocumented)
- months- (undocumented)
- days- (undocumented)
- hours- (undocumented)
- mins- (undocumented)
- secs- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
try_make_timestamp_ntzTry to create a local date-time from date and time fields.- Parameters:
- date- (undocumented)
- time- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.1.0
 
- 
make_ym_intervalMake year-month interval from years, months.- Parameters:
- years- (undocumented)
- months- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
make_ym_intervalMake year-month interval from years.- Parameters:
- years- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
make_ym_intervalMake year-month interval.- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
bucket(Java-specific) A transform for any type that partitions by a hash of the input column.- Parameters:
- numBuckets- (undocumented)
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
bucket(Java-specific) A transform for any type that partitions by a hash of the input column.- Parameters:
- numBuckets- (undocumented)
- e- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.0.0
 
- 
ifnull
Returns col2 if col1 is null, or col1 otherwise.
- Parameters:
- col1- (undocumented)
- col2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
isnotnull
Returns true if col is not null, or false otherwise.
- Parameters:
- col- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
equal_null
Returns the same result as the EQUAL(=) operator for non-null operands, but returns true if both are null, false if one of them is null.
- Parameters:
- col1- (undocumented)
- col2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
nullif
Returns null if col1 equals col2, or col1 otherwise.
- Parameters:
- col1- (undocumented)
- col2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
nullifzero
Returns null if col is equal to zero, or col otherwise.
- Parameters:
- col- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
 
- 
nvl
Returns col2 if col1 is null, or col1 otherwise.
- Parameters:
- col1- (undocumented)
- col2- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
nvl2
Returns col2 if col1 is not null, or col3 otherwise.
- Parameters:
- col1- (undocumented)
- col2- (undocumented)
- col3- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.5.0
 
- 
zeroifnull
Returns zero if col is null, or col otherwise.
- Parameters:
- col- (undocumented)
- Returns:
- (undocumented)
- Since:
- 4.0.0
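
A combined sketch of the null-handling helpers above (made-up columns a and b; assumes spark.implicits._ and functions._):

val pairs = Seq((Some(1), Some(1)), (None, Some(2))).toDF("a", "b")
pairs.select(
  nvl($"a", lit(0)),                      // a, or 0 when a is null
  nvl2($"a", lit("set"), lit("missing")), // "set" when a is not null
  nullif($"a", $"b"),                     // null when a = b
  equal_null($"a", $"b"),                 // null-safe equality
  zeroifnull($"a"))                       // 0 when a is null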
 
- 
udaf
public static <IN, BUF, OUT> UserDefinedFunction udaf(Aggregator<IN, BUF, OUT> agg, scala.reflect.api.TypeTags.TypeTag<IN> evidence$3)
Obtains a UserDefinedFunction that wraps the given Aggregator so that it may be used with untyped Data Frames.

val agg = // Aggregator[IN, BUF, OUT]

// declare a UDF based on agg
val aggUDF = udaf(agg)
val aggData = df.agg(aggUDF($"colname"))

// register agg as a named function
spark.udf.register("myAggName", udaf(agg))
- Parameters:
- agg- the typed Aggregator
- evidence$3- (undocumented)
- Returns:
- a UserDefinedFunction that can be used as an aggregating expression.
- Note:
- The input encoder is inferred from the input type IN.
 
- 
udaf
public static <IN, BUF, OUT> UserDefinedFunction udaf(Aggregator<IN, BUF, OUT> agg, Encoder<IN> inputEncoder)
Obtains a UserDefinedFunction that wraps the given Aggregator so that it may be used with untyped Data Frames.

Aggregator<IN, BUF, OUT> agg = // custom Aggregator
Encoder<IN> enc = // input encoder

// declare a UDF based on agg
UserDefinedFunction aggUDF = udaf(agg, enc)
DataFrame aggData = df.agg(aggUDF($"colname"))

// register agg as a named function
spark.udf.register("myAggName", udaf(agg, enc))
- Parameters:
- agg- the typed Aggregator
- inputEncoder- a specific input encoder to use
- Returns:
- a UserDefinedFunction that can be used as an aggregating expression
- Note:
- This overloading takes an explicit input encoder, to support UDAF declarations in Java.
 
- 
udfpublic static <RT> UserDefinedFunction udf(scala.Function0<RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$4) Defines a Scala closure of 0 arguments as user-defined function (UDF). The data types are automatically inferred based on the Scala closure's signature. By default the returned UDF is deterministic. To change it to nondeterministic, call the APIUserDefinedFunction.asNondeterministic().- Parameters:
- f- (undocumented)
- evidence$4- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
udfpublic static <RT,A1> UserDefinedFunction udf(scala.Function1<A1, RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$5, scala.reflect.api.TypeTags.TypeTag<A1> evidence$6) Defines a Scala closure of 1 arguments as user-defined function (UDF). The data types are automatically inferred based on the Scala closure's signature. By default the returned UDF is deterministic. To change it to nondeterministic, call the APIUserDefinedFunction.asNondeterministic().- Parameters:
- f- (undocumented)
- evidence$5- (undocumented)
- evidence$6- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
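
For example, a minimal Scala UDF sketch (made-up column value; assumes spark.implicits._ and functions._):

val plusOne = udf((x: Int) => x + 1)
val nums = Seq(1, 2, 3).toDF("value")
nums.select(plusOne($"value"))
// spark.udf.register("plus_one", plusOne) would also make it callable from SQL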
 
- 
udf
public static <RT, A1, A2> UserDefinedFunction udf(scala.Function2<A1, A2, RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$7, scala.reflect.api.TypeTags.TypeTag<A1> evidence$8, scala.reflect.api.TypeTags.TypeTag<A2> evidence$9)
Defines a Scala closure of 2 arguments as user-defined function (UDF). The data types are automatically inferred based on the Scala closure's signature. By default the returned UDF is deterministic. To change it to nondeterministic, call the API UserDefinedFunction.asNondeterministic().
- Parameters:
- f- (undocumented)
- evidence$7- (undocumented)
- evidence$8- (undocumented)
- evidence$9- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
udf
public static <RT, A1, A2, A3> UserDefinedFunction udf(scala.Function3<A1, A2, A3, RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$10, scala.reflect.api.TypeTags.TypeTag<A1> evidence$11, scala.reflect.api.TypeTags.TypeTag<A2> evidence$12, scala.reflect.api.TypeTags.TypeTag<A3> evidence$13)
Defines a Scala closure of 3 arguments as user-defined function (UDF). The data types are automatically inferred based on the Scala closure's signature. By default the returned UDF is deterministic. To change it to nondeterministic, call the API UserDefinedFunction.asNondeterministic().
- Parameters:
- f- (undocumented)
- evidence$10- (undocumented)
- evidence$11- (undocumented)
- evidence$12- (undocumented)
- evidence$13- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
udfpublic static <RT,A1, UserDefinedFunction udfA2, A3, A4> (scala.Function4<A1, A2, A3, A4, RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$14, scala.reflect.api.TypeTags.TypeTag<A1> evidence$15, scala.reflect.api.TypeTags.TypeTag<A2> evidence$16, scala.reflect.api.TypeTags.TypeTag<A3> evidence$17, scala.reflect.api.TypeTags.TypeTag<A4> evidence$18) Defines a Scala closure of 4 arguments as user-defined function (UDF). The data types are automatically inferred based on the Scala closure's signature. By default the returned UDF is deterministic. To change it to nondeterministic, call the APIUserDefinedFunction.asNondeterministic().- Parameters:
- f- (undocumented)
- evidence$14- (undocumented)
- evidence$15- (undocumented)
- evidence$16- (undocumented)
- evidence$17- (undocumented)
- evidence$18- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
udfpublic static <RT,A1, UserDefinedFunction udfA2, A3, A4, A5> (scala.Function5<A1, A2, A3, A4, A5, RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$19, scala.reflect.api.TypeTags.TypeTag<A1> evidence$20, scala.reflect.api.TypeTags.TypeTag<A2> evidence$21, scala.reflect.api.TypeTags.TypeTag<A3> evidence$22, scala.reflect.api.TypeTags.TypeTag<A4> evidence$23, scala.reflect.api.TypeTags.TypeTag<A5> evidence$24) Defines a Scala closure of 5 arguments as user-defined function (UDF). The data types are automatically inferred based on the Scala closure's signature. By default the returned UDF is deterministic. To change it to nondeterministic, call the APIUserDefinedFunction.asNondeterministic().- Parameters:
- f- (undocumented)
- evidence$19- (undocumented)
- evidence$20- (undocumented)
- evidence$21- (undocumented)
- evidence$22- (undocumented)
- evidence$23- (undocumented)
- evidence$24- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
udfpublic static <RT,A1, UserDefinedFunction udfA2, A3, A4, A5, A6> (scala.Function6<A1, A2, A3, A4, A5, A6, RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$25, scala.reflect.api.TypeTags.TypeTag<A1> evidence$26, scala.reflect.api.TypeTags.TypeTag<A2> evidence$27, scala.reflect.api.TypeTags.TypeTag<A3> evidence$28, scala.reflect.api.TypeTags.TypeTag<A4> evidence$29, scala.reflect.api.TypeTags.TypeTag<A5> evidence$30, scala.reflect.api.TypeTags.TypeTag<A6> evidence$31) Defines a Scala closure of 6 arguments as user-defined function (UDF). The data types are automatically inferred based on the Scala closure's signature. By default the returned UDF is deterministic. To change it to nondeterministic, call the APIUserDefinedFunction.asNondeterministic().- Parameters:
- f- (undocumented)
- evidence$25- (undocumented)
- evidence$26- (undocumented)
- evidence$27- (undocumented)
- evidence$28- (undocumented)
- evidence$29- (undocumented)
- evidence$30- (undocumented)
- evidence$31- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
udfpublic static <RT,A1, UserDefinedFunction udfA2, A3, A4, A5, A6, A7> (scala.Function7<A1, A2, A3, A4, A5, A6, A7, RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$32, scala.reflect.api.TypeTags.TypeTag<A1> evidence$33, scala.reflect.api.TypeTags.TypeTag<A2> evidence$34, scala.reflect.api.TypeTags.TypeTag<A3> evidence$35, scala.reflect.api.TypeTags.TypeTag<A4> evidence$36, scala.reflect.api.TypeTags.TypeTag<A5> evidence$37, scala.reflect.api.TypeTags.TypeTag<A6> evidence$38, scala.reflect.api.TypeTags.TypeTag<A7> evidence$39) Defines a Scala closure of 7 arguments as user-defined function (UDF). The data types are automatically inferred based on the Scala closure's signature. By default the returned UDF is deterministic. To change it to nondeterministic, call the APIUserDefinedFunction.asNondeterministic().- Parameters:
- f- (undocumented)
- evidence$32- (undocumented)
- evidence$33- (undocumented)
- evidence$34- (undocumented)
- evidence$35- (undocumented)
- evidence$36- (undocumented)
- evidence$37- (undocumented)
- evidence$38- (undocumented)
- evidence$39- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
udfpublic static <RT,A1, UserDefinedFunction udfA2, A3, A4, A5, A6, A7, A8> (scala.Function8<A1, A2, A3, A4, A5, A6, A7, A8, RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$40, scala.reflect.api.TypeTags.TypeTag<A1> evidence$41, scala.reflect.api.TypeTags.TypeTag<A2> evidence$42, scala.reflect.api.TypeTags.TypeTag<A3> evidence$43, scala.reflect.api.TypeTags.TypeTag<A4> evidence$44, scala.reflect.api.TypeTags.TypeTag<A5> evidence$45, scala.reflect.api.TypeTags.TypeTag<A6> evidence$46, scala.reflect.api.TypeTags.TypeTag<A7> evidence$47, scala.reflect.api.TypeTags.TypeTag<A8> evidence$48) Defines a Scala closure of 8 arguments as user-defined function (UDF). The data types are automatically inferred based on the Scala closure's signature. By default the returned UDF is deterministic. To change it to nondeterministic, call the APIUserDefinedFunction.asNondeterministic().- Parameters:
- f- (undocumented)
- evidence$40- (undocumented)
- evidence$41- (undocumented)
- evidence$42- (undocumented)
- evidence$43- (undocumented)
- evidence$44- (undocumented)
- evidence$45- (undocumented)
- evidence$46- (undocumented)
- evidence$47- (undocumented)
- evidence$48- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
udfpublic static <RT,A1, UserDefinedFunction udfA2, A3, A4, A5, A6, A7, A8, A9> (scala.Function9<A1, A2, A3, A4, A5, A6, A7, A8, A9, RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$49, scala.reflect.api.TypeTags.TypeTag<A1> evidence$50, scala.reflect.api.TypeTags.TypeTag<A2> evidence$51, scala.reflect.api.TypeTags.TypeTag<A3> evidence$52, scala.reflect.api.TypeTags.TypeTag<A4> evidence$53, scala.reflect.api.TypeTags.TypeTag<A5> evidence$54, scala.reflect.api.TypeTags.TypeTag<A6> evidence$55, scala.reflect.api.TypeTags.TypeTag<A7> evidence$56, scala.reflect.api.TypeTags.TypeTag<A8> evidence$57, scala.reflect.api.TypeTags.TypeTag<A9> evidence$58) Defines a Scala closure of 9 arguments as user-defined function (UDF). The data types are automatically inferred based on the Scala closure's signature. By default the returned UDF is deterministic. To change it to nondeterministic, call the APIUserDefinedFunction.asNondeterministic().- Parameters:
- f- (undocumented)
- evidence$49- (undocumented)
- evidence$50- (undocumented)
- evidence$51- (undocumented)
- evidence$52- (undocumented)
- evidence$53- (undocumented)
- evidence$54- (undocumented)
- evidence$55- (undocumented)
- evidence$56- (undocumented)
- evidence$57- (undocumented)
- evidence$58- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
udfpublic static <RT,A1, UserDefinedFunction udfA2, A3, A4, A5, A6, A7, A8, A9, A10> (scala.Function10<A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, RT> f, scala.reflect.api.TypeTags.TypeTag<RT> evidence$59, scala.reflect.api.TypeTags.TypeTag<A1> evidence$60, scala.reflect.api.TypeTags.TypeTag<A2> evidence$61, scala.reflect.api.TypeTags.TypeTag<A3> evidence$62, scala.reflect.api.TypeTags.TypeTag<A4> evidence$63, scala.reflect.api.TypeTags.TypeTag<A5> evidence$64, scala.reflect.api.TypeTags.TypeTag<A6> evidence$65, scala.reflect.api.TypeTags.TypeTag<A7> evidence$66, scala.reflect.api.TypeTags.TypeTag<A8> evidence$67, scala.reflect.api.TypeTags.TypeTag<A9> evidence$68, scala.reflect.api.TypeTags.TypeTag<A10> evidence$69) Defines a Scala closure of 10 arguments as user-defined function (UDF). The data types are automatically inferred based on the Scala closure's signature. By default the returned UDF is deterministic. To change it to nondeterministic, call the APIUserDefinedFunction.asNondeterministic().- Parameters:
- f- (undocumented)
- evidence$59- (undocumented)
- evidence$60- (undocumented)
- evidence$61- (undocumented)
- evidence$62- (undocumented)
- evidence$63- (undocumented)
- evidence$64- (undocumented)
- evidence$65- (undocumented)
- evidence$66- (undocumented)
- evidence$67- (undocumented)
- evidence$68- (undocumented)
- evidence$69- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.3.0
 
- 
udfDefines a Java UDF0 instance as user-defined function (UDF). The caller must specify the output data type, and there is no automatic input type coercion. By default the returned UDF is deterministic. To change it to nondeterministic, call the APIUserDefinedFunction.asNondeterministic().- Parameters:
- f- (undocumented)
- returnType- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.3.0
 
- 
udfDefines a Java UDF1 instance as user-defined function (UDF). The caller must specify the output data type, and there is no automatic input type coercion. By default the returned UDF is deterministic. To change it to nondeterministic, call the APIUserDefinedFunction.asNondeterministic().- Parameters:
- f- (undocumented)
- returnType- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.3.0
 
- 
udfDefines a Java UDF2 instance as user-defined function (UDF). The caller must specify the output data type, and there is no automatic input type coercion. By default the returned UDF is deterministic. To change it to nondeterministic, call the APIUserDefinedFunction.asNondeterministic().- Parameters:
- f- (undocumented)
- returnType- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.3.0
 
- 
udfDefines a Java UDF3 instance as user-defined function (UDF). The caller must specify the output data type, and there is no automatic input type coercion. By default the returned UDF is deterministic. To change it to nondeterministic, call the APIUserDefinedFunction.asNondeterministic().- Parameters:
- f- (undocumented)
- returnType- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.3.0
 
- 
udfDefines a Java UDF4 instance as user-defined function (UDF). The caller must specify the output data type, and there is no automatic input type coercion. By default the returned UDF is deterministic. To change it to nondeterministic, call the APIUserDefinedFunction.asNondeterministic().- Parameters:
- f- (undocumented)
- returnType- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.3.0
 
- 
udfDefines a Java UDF5 instance as user-defined function (UDF). The caller must specify the output data type, and there is no automatic input type coercion. By default the returned UDF is deterministic. To change it to nondeterministic, call the APIUserDefinedFunction.asNondeterministic().- Parameters:
- f- (undocumented)
- returnType- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.3.0
 
- 
udfDefines a Java UDF6 instance as user-defined function (UDF). The caller must specify the output data type, and there is no automatic input type coercion. By default the returned UDF is deterministic. To change it to nondeterministic, call the APIUserDefinedFunction.asNondeterministic().- Parameters:
- f- (undocumented)
- returnType- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.3.0
 
- 
udfDefines a Java UDF7 instance as user-defined function (UDF). The caller must specify the output data type, and there is no automatic input type coercion. By default the returned UDF is deterministic. To change it to nondeterministic, call the APIUserDefinedFunction.asNondeterministic().- Parameters:
- f- (undocumented)
- returnType- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.3.0
 
- 
udfDefines a Java UDF8 instance as user-defined function (UDF). The caller must specify the output data type, and there is no automatic input type coercion. By default the returned UDF is deterministic. To change it to nondeterministic, call the APIUserDefinedFunction.asNondeterministic().- Parameters:
- f- (undocumented)
- returnType- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.3.0
 
- 
udfDefines a Java UDF9 instance as user-defined function (UDF). The caller must specify the output data type, and there is no automatic input type coercion. By default the returned UDF is deterministic. To change it to nondeterministic, call the APIUserDefinedFunction.asNondeterministic().- Parameters:
- f- (undocumented)
- returnType- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.3.0
 
- 
udfDefines a Java UDF10 instance as user-defined function (UDF). The caller must specify the output data type, and there is no automatic input type coercion. By default the returned UDF is deterministic. To change it to nondeterministic, call the APIUserDefinedFunction.asNondeterministic().- Parameters:
- f- (undocumented)
- returnType- (undocumented)
- Returns:
- (undocumented)
- Since:
- 2.3.0
 
- 
udf
Deprecated. Scala `udf` method with return type parameter is deprecated. Please use Scala `udf` method without return type parameter. Since 3.0.0.
Defines a deterministic user-defined function (UDF) using a Scala closure. For this variant, the caller must specify the output data type, and there is no automatic input type coercion. By default the returned UDF is deterministic. To change it to nondeterministic, call the API UserDefinedFunction.asNondeterministic().
Note that, although the Scala closure can have a primitive-type function argument, it doesn't work well with null values. Because the Scala closure is passed in as Any type, there is no type information for the function arguments. Without the type information, Spark may blindly pass null to the Scala closure with a primitive-type argument, and the closure will see the default value of the Java type for the null argument, e.g. for udf((x: Int) => x, IntegerType), the result is 0 for null input.
- Parameters:
- f- A closure in Scala
- dataType- The output data type of the UDF
- Returns:
- (undocumented)
- Since:
- 2.0.0
 
- 
callUDF
Deprecated. Use call_udf.
Call a user-defined function.
- Parameters:
- udfName- (undocumented)
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 1.5.0
 
- 
call_udf
Call a user-defined function. Example:

import org.apache.spark.sql._
val df = Seq(("id1", 1), ("id2", 4), ("id3", 5)).toDF("id", "value")
val spark = df.sparkSession
spark.udf.register("simpleUDF", (v: Int) => v * v)
df.select($"id", call_udf("simpleUDF", $"value"))
- Parameters:
- udfName- (undocumented)
- cols- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.2.0
 
- 
call_functionCall a SQL function.- Parameters:
- funcName- function name that follows the SQL identifier syntax (can be quoted, can be qualified)
- cols- the expression parameters of function
- Returns:
- (undocumented)
- Since:
- 3.5.0
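
A brief sketch calling a built-in SQL function by name (made-up column; assumes spark.implicits._ and functions._):

val names = Seq("Spark").toDF("name")
// equivalent to selecting lower(name)
names.select(call_function("lower", $"name"))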
 
- 
unwrap_udtUnwrap UDT data type column into its underlying type.- Parameters:
- column- (undocumented)
- Returns:
- (undocumented)
- Since:
- 3.4.0
 
 
-