pyspark.pandas.DataFrame.ewm

DataFrame.ewm(com: Optional[float] = None, span: Optional[float] = None, halflife: Optional[float] = None, alpha: Optional[float] = None, min_periods: Optional[int] = None, ignore_na: bool = False) → ExponentialMoving[FrameLike]

Provide exponentially weighted window transformations.

Note

Unlike pandas, ‘min_periods’ in pandas-on-Spark works as a fixed window size, and NA values are also counted toward the period. This behavior might change in the future.

New in version 3.4.0.

Parameters
com: float, optional

Specify decay in terms of center of mass. alpha = 1 / (1 + com), for com >= 0.

span: float, optional

Specify decay in terms of span. alpha = 2 / (span + 1), for span >= 1.

halflife: float, optional

Specify decay in terms of half-life. alpha = 1 - exp(-ln(2) / halflife), for halflife > 0.

alpha: float, optional

Specify smoothing factor alpha directly. 0 < alpha <= 1.

min_periods: int, default None

Minimum number of observations in window required to have a value (otherwise result is NA).

ignore_na: bool, default False

Ignore missing values when calculating weights.

  • When ignore_na=False (default), weights are based on absolute positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \((1-\alpha)^2\) and \(1\) if adjust=True, and \((1-\alpha)^2\) and \(\alpha\) if adjust=False.

  • When ignore_na=True, weights are based on relative positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \(1-\alpha\) and \(1\) if adjust=True, and \(1-\alpha\) and \(\alpha\) if adjust=False.
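The two bullets above can be checked with a short pure-Python sketch. The helper below is hypothetical (not part of the pandas-on-Spark API); it computes the final weighted average of [\(x_0\), None, \(x_2\)] under both settings, assuming adjust=True:

```python
def ewm_last(values, alpha, ignore_na):
    """Final exponentially weighted average of `values` (None = missing),
    using adjust=True weighting as described in the bullets above.
    Hypothetical illustration helper, not a library function."""
    # Keep (position, value) pairs for the non-missing observations.
    obs = [(i, v) for i, v in enumerate(values) if v is not None]
    if ignore_na:
        # Relative positions: the decay steps once per non-missing observation.
        weights = [(1 - alpha) ** (len(obs) - 1 - k) for k in range(len(obs))]
    else:
        # Absolute positions: missing entries still advance the decay.
        last = len(values) - 1
        weights = [(1 - alpha) ** (last - i) for i, _ in obs]
    return sum(w * v for w, (_, v) in zip(weights, obs)) / sum(weights)

# x0=1, x2=2, alpha=0.5:
# ignore_na=False -> weights (1-alpha)^2 and 1 -> (0.25*1 + 1*2) / 1.25 = 1.8
# ignore_na=True  -> weights (1-alpha)   and 1 -> (0.5*1  + 1*2) / 1.5  = 5/3
print(ewm_last([1.0, None, 2.0], 0.5, ignore_na=False))
print(ewm_last([1.0, None, 2.0], 0.5, ignore_na=True))
```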

Returns
an ExponentialMoving window object sub-classed for the particular operation
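The returned object exposes aggregations such as mean(). Since pandas-on-Spark mirrors the pandas ewm API, a minimal sketch using plain pandas (an assumption, chosen so the snippet runs without a Spark session; with `import pyspark.pandas as ps` the calls are the same) illustrates the call pattern and the decay-parameter equivalence:

```python
import pandas as pd

# The same calls apply in pandas-on-Spark: import pyspark.pandas as ps
# and build the DataFrame with ps.DataFrame instead.
df = pd.DataFrame({"B": [0.0, 1.0, 2.0, None, 4.0]})

# Exponentially weighted mean with decay given as a center of mass.
ewm_com = df.ewm(com=0.5).mean()

# span=2 maps to alpha = 2 / (span + 1) = 2/3, the same alpha that
# com=0.5 yields via alpha = 1 / (1 + com), so the results match.
ewm_span = df.ewm(span=2.0).mean()

print(ewm_com)
print(ewm_com.equals(ewm_span))
```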