pyspark.sql.functions.hash#
- pyspark.sql.functions.hash(*cols)[source]#
Calculates the hash code of the given columns, and returns the result as an int column.

New in version 2.0.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
- cols : Column or column name
  one or more columns to compute on.

Returns
- Column
  hash value as an int column.
Examples

```python
>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([('ABC', 'DEF')], ['c1', 'c2'])
>>> df.select('*', sf.hash('c1')).show()
+---+---+----------+
| c1| c2|  hash(c1)|
+---+---+----------+
|ABC|DEF|-757602832|
+---+---+----------+

>>> df.select('*', sf.hash('c1', df.c2)).show()
+---+---+------------+
| c1| c2|hash(c1, c2)|
+---+---+------------+
|ABC|DEF|   599895104|
+---+---+------------+

>>> df.select('*', sf.hash('*')).show()
+---+---+------------+
| c1| c2|hash(c1, c2)|
+---+---+------------+
|ABC|DEF|   599895104|
+---+---+------------+
```