pyspark.pandas.Series.autocorr
- Series.autocorr(lag=1)
- Compute the lag-N autocorrelation.
- This method computes the Pearson correlation between the Series and its shifted self.
- Note
- The current implementation uses Spark’s Window without a partition specification. This moves all data into a single partition on a single machine and can cause serious performance degradation. Avoid this method with very large datasets.
- New in version 3.4.0.
- Parameters
- lag : int, default 1
- Number of lags to apply before performing autocorrelation. 
 
- Returns
- float
- The Pearson correlation between self and self.shift(lag). 
 
- See also
- Series.corr
- Compute the correlation between two Series. 
- Series.shift
- Shift index by desired number of periods. 
- DataFrame.corr
- Compute pairwise correlation of columns. 
- Notes
- If the Pearson correlation is not well defined, ‘NaN’ is returned.
- Examples
>>> import numpy as np
>>> import pyspark.pandas as ps
>>> s = ps.Series([.2, .0, .6, .2, np.nan, .5, .6])
>>> s.autocorr()
-0.141219...
>>> s.autocorr(0)
1.0...
>>> s.autocorr(2)
0.970725...
>>> s.autocorr(-3)
0.277350...
>>> s.autocorr(5)
-1.000000...
>>> s.autocorr(6)
nan
- If the Pearson correlation is not well defined, then ‘NaN’ is returned.
>>> s = ps.Series([1, 0, 0, 0])
>>> s.autocorr()
nan
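To illustrate the definition, the following pure-Python sketch reproduces the values in the examples above without Spark. It is not the pyspark implementation, and `autocorr_sketch` is a hypothetical helper name used only here. It assumes a small in-memory list of numbers, pairs each element with the element `lag` positions earlier (as `shift(lag)` would), drops pairs that contain NaN, and treats fewer than two valid pairs or zero variance as "not well defined".

```python
import math


def autocorr_sketch(values, lag=1):
    """Pearson correlation between values and values shifted by lag.

    Illustrative only: a hypothetical helper, not part of pyspark.pandas.
    """
    n = len(values)
    pairs = []
    for i in range(n):
        j = i - lag  # shift(lag) places values[i - lag] at position i
        if 0 <= j < n:
            x, y = values[i], values[j]
            # Drop pairs with a missing value on either side.
            if not (math.isnan(x) or math.isnan(y)):
                pairs.append((x, y))

    # With fewer than two pairs the correlation is not well defined.
    if len(pairs) < 2:
        return math.nan

    mean_x = sum(x for x, _ in pairs) / len(pairs)
    mean_y = sum(y for _, y in pairs) / len(pairs)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)

    # Simplification: only an exactly zero variance is treated as undefined.
    if var_x == 0 or var_y == 0:
        return math.nan
    return cov / math.sqrt(var_x * var_y)


series = [0.2, 0.0, 0.6, 0.2, float("nan"), 0.5, 0.6]
print(autocorr_sketch(series, lag=2))  # close to 0.970725, as in s.autocorr(2)
```

A lag of 6 leaves a single valid pair, and the series `[1, 0, 0, 0]` at lag 1 leaves a constant side with zero variance, so the sketch returns NaN in both cases, matching the documented behavior.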