pyspark.sql.functions.regexp_extract(str, pattern, idx)

Extract a specific group matched by a Java regex from the specified string column. If the regex does not match, or the specified group does not match, an empty string is returned.

New in version 1.5.0.


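>>> from pyspark.sql.functions import regexp_extract
>>> df = spark.createDataFrame([('100-200',)], ['str'])
>>> df.select(regexp_extract('str', r'(\d+)-(\d+)', 1).alias('d')).collect()
[Row(d='100')]
>>> df = spark.createDataFrame([('foo',)], ['str'])
>>> df.select(regexp_extract('str', r'(\d+)', 1).alias('d')).collect()
[Row(d='')]
>>> df = spark.createDataFrame([('aaaac',)], ['str'])
>>> df.select(regexp_extract('str', '(a+)(b)?(c)', 2).alias('d')).collect()
[Row(d='')]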
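
The empty-string rule described above can be illustrated without a Spark session. The following is a rough pure-Python sketch, not Spark's implementation: the helper name `regexp_extract_py` is hypothetical, and it uses Python's `re` module, whose syntax differs from Java regex in some details. It assumes the first match found anywhere in the string is the one used.

```python
import re


def regexp_extract_py(s, pattern, idx):
    """Hypothetical helper: approximate regexp_extract for one Python string."""
    m = re.search(pattern, s)  # first match anywhere in the string
    if m is None:
        return ""  # the regex did not match at all
    group = m.group(idx)  # idx 0 refers to the whole match
    # An optional group that did not participate in the match is None.
    return group if group is not None else ""


print(repr(regexp_extract_py("100-200", r"(\d+)-(\d+)", 1)))  # '100'
print(repr(regexp_extract_py("foo", r"(\d+)", 1)))            # ''
print(repr(regexp_extract_py("aaaac", "(a+)(b)?(c)", 2)))     # ''
```

The three calls mirror the three examples above: a matched group, a regex that does not match, and an optional group that does not participate in an otherwise successful match.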