pyspark.sql.functions.months_between

pyspark.sql.functions.months_between(date1: ColumnOrName, date2: ColumnOrName, roundOff: bool = True) → pyspark.sql.column.Column[source]

Returns number of months between dates date1 and date2. If date1 is later than date2, then the result is positive. A whole number is returned if both inputs have the same day of month or both are the last day of their respective months. Otherwise, the difference is calculated assuming 31 days per month. The result is rounded off to 8 digits unless roundOff is set to False.

New in version 1.5.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
date1Column or str

first date column.

date2Column or str

second date column.

roundOffbool, optional

whether to round (to 8 digits) the final value or not (default: True).

Returns
Column

number of months between two dates.

Examples

>>> df = spark.createDataFrame([('1997-02-28 10:30:00', '1996-10-30')], ['date1', 'date2'])
>>> df.select(months_between(df.date1, df.date2).alias('months')).collect()
[Row(months=3.94959677)]
>>> df.select(months_between(df.date1, df.date2, False).alias('months')).collect()
[Row(months=3.9495967741935485)]