pyspark.sql.functions.rand
- pyspark.sql.functions.rand(seed=None)
Generates a random column with independent and identically distributed (i.i.d.) samples uniformly distributed in [0.0, 1.0).
New in version 1.4.0.
Changed in version 3.4.0: Supports Spark Connect.
- Parameters
- seed : int, optional
Seed value for the random generator.
- Returns
Column
A column of random values.
Notes
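The function is non-deterministic in the general case. Even with a fixed seed, the generated values can depend on how the data is partitioned and ordered.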
Examples
Example 1: Generate a random column without a seed
>>> from pyspark.sql import functions as sf
>>> spark.range(0, 2, 1, 1).withColumn('rand', sf.rand()).show()
+---+-------------------+
| id|               rand|
+---+-------------------+
|  0|0.14879325244215424|
|  1| 0.4640631044275454|
+---+-------------------+
Example 2: Generate a random column with a specific seed, scaled to [0.0, 3.0)
>>> spark.range(0, 2, 1, 1).withColumn('rand', sf.rand(seed=42) * 3).show()
+---+------------------+
| id|              rand|
+---+------------------+
|  0|1.8575681106759028|
|  1|1.5288056527339444|
+---+------------------+