pyspark.sql.functions.array_position
- pyspark.sql.functions.array_position(col, value)
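- Array function: Locates the position of the first occurrence of the given value in the given array. Returns null if either of the arguments is null.
- New in version 2.4.0.
- Changed in version 3.4.0: Supports Spark Connect.
- Parameters
- col: the column containing the array to search.
- value: the value to look for; a literal or, as in Example 6, another column.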
- Returns
- Column
- position of the value in the given array if found and 0 otherwise. 
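The return rules can be restated as a small pure-Python model, which may help when reasoning about edge cases without a Spark session. This is only an illustrative sketch: `array_position_model` is a hypothetical helper, not part of PySpark, and it makes no claim about how Spark evaluates the function internally. It assumes the documented behavior only: positions are 1-based, 0 means "not found", and a null argument yields null.

```python
from typing import Any, Optional, Sequence


def array_position_model(arr: Optional[Sequence[Any]], value: Any) -> Optional[int]:
    """Hypothetical model of the documented return rules (not Spark's code)."""
    # A null array or a null search value yields null (None in Python).
    if arr is None or value is None:
        return None
    # Positions are counted from 1; null elements are skipped.
    for position, element in enumerate(arr, start=1):
        if element is not None and element == value:
            return position
    # 0 signals that the value does not occur in the array.
    return 0
```

Because 0 is reserved for "not found", a caller who needs a 0-based index should check for 0 before subtracting 1.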
 
- Notes
- Positions are 1-based, not 0-based. Returns 0 if the given value could not be found in the array.
- Examples
- Example 1: Finding the position of a string in an array of strings
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([(["c", "b", "a"],)], ['data'])
>>> df.select(sf.array_position(df.data, "a")).show()
+-----------------------+
|array_position(data, a)|
+-----------------------+
|                      3|
+-----------------------+

- Example 2: Finding the position of a string in an empty array
>>> from pyspark.sql import functions as sf
>>> from pyspark.sql.types import ArrayType, StringType, StructField, StructType
>>> schema = StructType([StructField("data", ArrayType(StringType()), True)])
>>> df = spark.createDataFrame([([],)], schema=schema)
>>> df.select(sf.array_position(df.data, "a")).show()
+-----------------------+
|array_position(data, a)|
+-----------------------+
|                      0|
+-----------------------+

- Example 3: Finding the position of an integer in an array of integers
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([([1, 2, 3],)], ['data'])
>>> df.select(sf.array_position(df.data, 2)).show()
+-----------------------+
|array_position(data, 2)|
+-----------------------+
|                      2|
+-----------------------+

- Example 4: Finding the position of a non-existing value in an array
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([(["c", "b", "a"],)], ['data'])
>>> df.select(sf.array_position(df.data, "d")).show()
+-----------------------+
|array_position(data, d)|
+-----------------------+
|                      0|
+-----------------------+

- Example 5: Finding the position of a value in an array with nulls
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([([None, "b", "a"],)], ['data'])
>>> df.select(sf.array_position(df.data, "a")).show()
+-----------------------+
|array_position(data, a)|
+-----------------------+
|                      3|
+-----------------------+

- Example 6: Finding the position of a column’s value in an array of integers
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([([10, 20, 30], 20)], ['data', 'col'])
>>> df.select(sf.array_position(df.data, df.col)).show()
+-------------------------+
|array_position(data, col)|
+-------------------------+
|                        2|
+-------------------------+