pyspark.sql.functions.levenshtein
pyspark.sql.functions.levenshtein(left: ColumnOrName, right: ColumnOrName, threshold: Optional[int] = None) → pyspark.sql.column.Column
Computes the Levenshtein distance of the two given strings.
New in version 1.5.0.
Changed in version 3.4.0: Supports Spark Connect.
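- Parameters
left : Column or str
first column value.
right : Column or str
second column value.
threshold : int, optional
if set, the distance is returned only when it is less than or equal to the threshold; otherwise -1 is returned.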
- Returns
Column
Levenshtein distance as integer value.
Examples
>>> from pyspark.sql.functions import levenshtein
>>> df0 = spark.createDataFrame([('kitten', 'sitting',)], ['l', 'r'])
>>> df0.select(levenshtein('l', 'r').alias('d')).collect()
[Row(d=3)]
>>> df0.select(levenshtein('l', 'r', 2).alias('d')).collect()
[Row(d=-1)]
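To make the results above easier to check by hand, the following is a minimal pure-Python sketch of the classic dynamic-programming Levenshtein distance, with the threshold behaviour seen in the examples (return -1 when the distance exceeds the threshold). The function name `levenshtein_distance` is hypothetical and this is not Spark's implementation; it assumes plain Python strings and does not model null handling or column semantics.

```python
from typing import Optional


def levenshtein_distance(left: str, right: str, threshold: Optional[int] = None) -> int:
    """Illustrative reference only (hypothetical helper, not part of PySpark)."""
    # previous[j] is the distance between the processed prefix of `left`
    # and the first j characters of `right`.
    previous = list(range(len(right) + 1))
    for i, left_char in enumerate(left, start=1):
        current = [i]  # distance from left[:i] to the empty string
        for j, right_char in enumerate(right, start=1):
            cost = 0 if left_char == right_char else 1
            current.append(
                min(
                    previous[j] + 1,         # delete a character from left
                    current[j - 1] + 1,      # insert a character into left
                    previous[j - 1] + cost,  # substitute (or keep) a character
                )
            )
        previous = current

    distance = previous[-1]
    # Assumed threshold rule, mirroring the examples: -1 if distance > threshold.
    if threshold is not None and distance > threshold:
        return -1
    return distance


print(levenshtein_distance("kitten", "sitting"))     # 3
print(levenshtein_distance("kitten", "sitting", 2))  # -1
```

The sketch keeps only two rows of the distance table, so memory use grows with the length of `right` rather than with the product of both lengths.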