pyspark.sql.functions.make_timestamp

pyspark.sql.functions.make_timestamp(years=None, months=None, days=None, hours=None, mins=None, secs=None, timezone=None, date=None, time=None)

Create a timestamp from years, months, days, hours, mins, secs, and (optionally) timezone fields. Alternatively, create a timestamp from date, time, and (optionally) timezone fields. The result data type follows the value of the configuration spark.sql.timestampType. If the configuration spark.sql.ansi.enabled is false, the function returns NULL on invalid inputs; otherwise, it throws an error.
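The NULL-versus-error rule can be pictured with a small standard-library sketch. It is not Spark's implementation: `make_timestamp_sketch` and its `ansi_enabled` flag are hypothetical names standing in for the function and for spark.sql.ansi.enabled, and Python's own `datetime` validation stands in for Spark's. The sketch does not model the secs = 60 case.

```python
from datetime import datetime
from typing import Optional


def make_timestamp_sketch(years: int, months: int, days: int,
                          hours: int, mins: int, secs: int,
                          ansi_enabled: bool) -> Optional[datetime]:
    """Hypothetical stand-in that mimics the documented invalid-input rule."""
    try:
        # datetime() raises ValueError for out-of-range fields such as month 13.
        return datetime(years, months, days, hours, mins, secs)
    except ValueError:
        if ansi_enabled:
            # ANSI mode on: the invalid input surfaces as an error.
            raise
        # ANSI mode off: the invalid input yields NULL (None in this sketch).
        return None


valid = make_timestamp_sketch(2014, 12, 28, 6, 30, 45, ansi_enabled=False)
invalid = make_timestamp_sketch(2014, 13, 28, 6, 30, 45, ansi_enabled=False)
```

Here `valid` holds a timestamp and `invalid` is None, while the same invalid call with `ansi_enabled=True` raises instead of returning.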

New in version 3.5.0.

Changed in version 4.1.0: Added support for creating timestamps from date and time.

Parameters
years : Column or column name, optional

The year to represent, from 1 to 9999. Required when creating timestamps from individual components. Must be used with months, days, hours, mins, and secs.

months : Column or column name, optional

The month-of-year to represent, from 1 (January) to 12 (December). Required when creating timestamps from individual components. Must be used with years, days, hours, mins, and secs.

days : Column or column name, optional

The day-of-month to represent, from 1 to 31. Required when creating timestamps from individual components. Must be used with years, months, hours, mins, and secs.

hours : Column or column name, optional

The hour-of-day to represent, from 0 to 23. Required when creating timestamps from individual components. Must be used with years, months, days, mins, and secs.

mins : Column or column name, optional

The minute-of-hour to represent, from 0 to 59. Required when creating timestamps from individual components. Must be used with years, months, days, hours, and secs.

secs : Column or column name, optional

The second-of-minute to represent, including its fractional part down to microseconds, from 0 to 60. The value can be either an integer like 13, or a fraction like 13.123. If secs equals 60, the seconds field is set to 0 and 1 minute is added to the final timestamp. Required when creating timestamps from individual components. Must be used with years, months, days, hours, and mins.

timezone : Column or column name, optional

The time zone identifier, for example CET or UTC.

date : Column or column name, optional

The date to represent, as a valid DATE value. Required when creating timestamps from date and time components. Must be used together with time (and, optionally, timezone), not with the individual component parameters.

time : Column or column name, optional

The time to represent, as a valid TIME value. Required when creating timestamps from date and time components. Must be used together with date (and, optionally, timezone), not with the individual component parameters.
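The secs rule in the parameter list above (fractional values, and the special value 60) can be sketched with the standard library. This only illustrates the documented rule and is not Spark's code; `combine_sketch` is a hypothetical helper.

```python
from datetime import datetime, timedelta
from decimal import Decimal


def combine_sketch(years, months, days, hours, mins, secs):
    """Hypothetical helper: build a datetime applying the documented secs rule."""
    secs = Decimal(str(secs))  # avoid binary floating-point rounding
    if secs == 60:
        # secs == 60: seconds become 0 and one minute is added to the result.
        base = datetime(years, months, days, hours, mins, 0)
        return base + timedelta(minutes=1)
    whole = int(secs)                          # whole seconds
    micros = int((secs - whole) * 1_000_000)   # fraction as microseconds
    return datetime(years, months, days, hours, mins, whole, micros)


fractional = combine_sketch(2014, 12, 28, 6, 30, 45.887)
rolled_over = combine_sketch(2014, 12, 28, 6, 59, 60)
```

In this sketch `fractional` keeps 887000 microseconds, and `rolled_over` lands on 07:00:00 because the extra minute carries into the hour.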

Returns
Column

A new column that contains a timestamp.

Examples

>>> spark.conf.set("spark.sql.session.timeZone", "America/Los_Angeles")

Example 1: Make timestamp from years, months, days, hours, mins, secs, and timezone.

>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([[2014, 12, 28, 6, 30, 45.887, 'CET']],
...     ['year', 'month', 'day', 'hour', 'min', 'sec', 'tz'])
>>> df.select(
...     sf.make_timestamp(df.year, df.month, df.day, 'hour', df.min, df.sec, 'tz')
... ).show(truncate=False)
+----------------------------------------------------+
|make_timestamp(year, month, day, hour, min, sec, tz)|
+----------------------------------------------------+
|2014-12-27 21:30:45.887                             |
+----------------------------------------------------+
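
The displayed value differs from the input fields because the fields are read as wall-clock time in the tz zone (CET) and the result is rendered in the session time zone (America/Los_Angeles). The same conversion can be checked outside Spark with a standard-library sketch, assuming Python 3.9+ and a time zone database available to `zoneinfo`:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# The input fields from Example 1, read as wall-clock time in CET.
local_cet = datetime(2014, 12, 28, 6, 30, 45, 887000, tzinfo=ZoneInfo("CET"))

# Render the same instant in the session time zone used by the examples.
in_session_tz = local_cet.astimezone(ZoneInfo("America/Los_Angeles"))

# Trim microseconds to milliseconds to match the table above.
rendered = in_session_tz.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]
print(rendered)  # 2014-12-27 21:30:45.887
```

CET is one hour ahead of UTC in December and Los Angeles is eight hours behind, so the nine-hour shift moves the value to the previous calendar day.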

Example 2: Make timestamp from years, months, days, hours, mins, and secs (without timezone).

>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([[2014, 12, 28, 6, 30, 45.887]],
...     ['year', 'month', 'day', 'hour', 'min', 'sec'])
>>> df.select(
...     sf.make_timestamp(df.year, df.month, df.day, 'hour', df.min, df.sec)
... ).show(truncate=False)
+------------------------------------------------+
|make_timestamp(year, month, day, hour, min, sec)|
+------------------------------------------------+
|2014-12-28 06:30:45.887                         |
+------------------------------------------------+

Example 3: Make timestamp from date, time, and timezone.

>>> import pyspark.sql.functions as sf
>>> from datetime import date, time
>>> df = spark.range(1).select(
...     sf.lit(date(2014, 12, 28)).alias("date"),
...     sf.lit(time(6, 30, 45, 887000)).alias("time"),
...     sf.lit("CET").alias("tz")
... )
>>> df.select(
...     sf.make_timestamp(date=df.date, time=df.time, timezone=df.tz)
... ).show(truncate=False)
+------------------------------+
|make_timestamp(date, time, tz)|
+------------------------------+
|2014-12-27 21:30:45.887       |
+------------------------------+

Example 4: Make timestamp from date and time (without timezone).

>>> import pyspark.sql.functions as sf
>>> from datetime import date, time
>>> df = spark.range(1).select(
...     sf.lit(date(2014, 12, 28)).alias("date"),
...     sf.lit(time(6, 30, 45, 887000)).alias("time")
... )
>>> df.select(sf.make_timestamp(date=df.date, time=df.time)).show(truncate=False)
+--------------------------+
|make_timestamp(date, time)|
+--------------------------+
|2014-12-28 06:30:45.887   |
+--------------------------+
>>> spark.conf.unset("spark.sql.session.timeZone")