pyspark.pandas.DataFrame.align

DataFrame.align(other: Union[DataFrame, Series], join: str = 'outer', axis: Union[int, str, None] = None, copy: bool = True) → Tuple[DataFrame, Union[DataFrame, Series]]
Align two objects on their axes with the specified join method.

The join method is applied to the Index of each axis being aligned.

Parameters
- other: DataFrame or Series
- join: {‘outer’, ‘inner’, ‘left’, ‘right’}, default ‘outer’
- axis: allowed axis of the other object, default None
  Align on index (0), columns (1), or both (None).
- copy: bool, default True
  If True, always returns new objects. If copy=False and no reindexing is required, the original objects are returned.
 
Returns
- (left, right): (DataFrame, type of other)
  Aligned objects.
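To make the interaction of the join and axis parameters concrete, the following is a minimal pure-Python sketch of which labels each returned object ends up with. It is not how pandas-on-Spark implements align; the helpers `join_labels` and `aligned_labels` are hypothetical names introduced only for illustration, the sketch handles only `axis` values of 0, 1, and None, and it sorts the outer and inner results only so the output is deterministic.

```python
from typing import Hashable, List, Optional, Sequence, Tuple

Labels = List[Hashable]


def join_labels(left: Sequence[Hashable], right: Sequence[Hashable],
                join: str = "outer") -> Labels:
    """Hypothetical helper: combine two label sequences with a join method."""
    if join == "outer":
        # Union of both label sets (sorted here only for a stable result).
        return sorted(set(left) | set(right))
    if join == "inner":
        # Only labels present on both sides.
        return sorted(set(left) & set(right))
    if join == "left":
        return list(left)
    if join == "right":
        return list(right)
    raise ValueError(f"unsupported join: {join!r}")


def aligned_labels(
    left_index: Sequence[Hashable], left_columns: Sequence[Hashable],
    right_index: Sequence[Hashable], right_columns: Sequence[Hashable],
    join: str = "outer", axis: Optional[int] = None,
) -> Tuple[Tuple[Labels, Labels], Tuple[Labels, Labels]]:
    """Hypothetical helper: return ((index, columns), (index, columns))
    for the left and right objects after alignment."""
    if axis not in (None, 0, 1):
        raise ValueError("this sketch only handles axis=None, 0, or 1")
    l_idx, l_cols = list(left_index), list(left_columns)
    r_idx, r_cols = list(right_index), list(right_columns)
    if axis in (None, 0):
        # Both objects receive the same joined index.
        l_idx = r_idx = join_labels(left_index, right_index, join)
    if axis in (None, 1):
        # Both objects receive the same joined columns.
        l_cols = r_cols = join_labels(left_columns, right_columns, join)
    return (l_idx, l_cols), (r_idx, r_cols)


# Labels of two frames: index [10, 20, 30] with columns a, b,
# and index [10, 11, 12] with columns a, c.
left, right = aligned_labels([10, 20, 30], ["a", "b"], [10, 11, 12], ["a", "c"])
print(left)   # ([10, 11, 12, 20, 30], ['a', 'b', 'c'])
print(right)  # ([10, 11, 12, 20, 30], ['a', 'b', 'c'])
```

Labels that an object did not originally have are the positions that show up as missing values (NaN or None) in the aligned result.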
 
Examples

>>> ps.set_option("compute.ops_on_diff_frames", True)
>>> df1 = ps.DataFrame({"a": [1, 2, 3], "b": ["a", "b", "c"]}, index=[10, 20, 30])
>>> df2 = ps.DataFrame({"a": [4, 5, 6], "c": ["d", "e", "f"]}, index=[10, 11, 12])

Align both axes:

>>> aligned_l, aligned_r = df1.align(df2)
>>> aligned_l.sort_index()
      a     b   c
10  1.0     a NaN
11  NaN  None NaN
12  NaN  None NaN
20  2.0     b NaN
30  3.0     c NaN
>>> aligned_r.sort_index()
      a   b     c
10  4.0 NaN     d
11  5.0 NaN     e
12  6.0 NaN     f
20  NaN NaN  None
30  NaN NaN  None

Align only axis=0 (index):

>>> aligned_l, aligned_r = df1.align(df2, axis=0)
>>> aligned_l.sort_index()
      a     b
10  1.0     a
11  NaN  None
12  NaN  None
20  2.0     b
30  3.0     c
>>> aligned_r.sort_index()
      a     c
10  4.0     d
11  5.0     e
12  6.0     f
20  NaN  None
30  NaN  None

Align only axis=1 (columns):

>>> aligned_l, aligned_r = df1.align(df2, axis=1)
>>> aligned_l.sort_index()
    a  b   c
10  1  a NaN
20  2  b NaN
30  3  c NaN
>>> aligned_r.sort_index()
    a   b  c
10  4 NaN  d
11  5 NaN  e
12  6 NaN  f

Align with the join type “inner”:

>>> aligned_l, aligned_r = df1.align(df2, join="inner")
>>> aligned_l.sort_index()
    a
10  1
>>> aligned_r.sort_index()
    a
10  4

Align with a Series:

>>> s = ps.Series([7, 8, 9], index=[10, 11, 12])
>>> aligned_l, aligned_r = df1.align(s, axis=0)
>>> aligned_l.sort_index()
      a     b
10  1.0     a
11  NaN  None
12  NaN  None
20  2.0     b
30  3.0     c
>>> aligned_r.sort_index()
10    7.0
11    8.0
12    9.0
20    NaN
30    NaN
dtype: float64

>>> ps.reset_option("compute.ops_on_diff_frames")