pyspark.pandas.MultiIndex.symmetric_difference

MultiIndex.symmetric_difference(other, result_name=None, sort=None)

Compute the symmetric difference of two MultiIndex objects.

Parameters

other : Index or array-like
result_name : list
sort : True or None, default None
    Whether to sort the resulting index.
    * True : Attempt to sort the result.
    * None : Do not sort the result.

Returns

symmetric_difference : MultiIndex

Notes

symmetric_difference contains elements that appear in either idx1 or idx2 but not both. Equivalent to the Index created by idx1.difference(idx2) | idx2.difference(idx1) with duplicates dropped.
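
The equivalence stated above can be sketched with plain Python sets, treating each MultiIndex entry as a tuple. This is only an illustration of the semantics under that assumption: it does not use pyspark.pandas, and the names idx1, idx2, and symmetric_difference_of are hypothetical.

```python
def symmetric_difference_of(idx1, idx2):
    """Return the entries found in exactly one of the two inputs, as a set."""
    # Entries only in idx1, then entries only in idx2 (sets drop duplicates).
    left_only = set(idx1) - set(idx2)
    right_only = set(idx2) - set(idx1)
    # The union of the two one-sided differences is the symmetric difference.
    return left_only | right_only


# Hypothetical index entries; ("lama", "speed") is deliberately duplicated.
idx1 = [("lama", "speed"), ("lama", "speed"), ("cow", "speed"), ("falcon", "length")]
idx2 = [("pandas-on-Spark", "speed"), ("cow", "speed"), ("falcon", "length")]

result = symmetric_difference_of(idx1, idx2)
print(sorted(result))  # [('lama', 'speed'), ('pandas-on-Spark', 'speed')]
```

The duplicated entry appears once in the result, which mirrors the "duplicates dropped" wording in the note. Python sets are unordered, so the sketch says nothing about the order in which the real method returns its entries.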

Examples

>>> import pandas as pd
>>> import pyspark.pandas as ps
>>> midx1 = pd.MultiIndex([['lama', 'cow', 'falcon'],
...                        ['speed', 'weight', 'length']],
...                       [[0, 0, 0, 1, 1, 1, 2, 2, 2],
...                        [0, 0, 0, 0, 1, 2, 0, 1, 2]])
>>> midx2 = pd.MultiIndex([['pandas-on-Spark', 'cow', 'falcon'],
...                        ['speed', 'weight', 'length']],
...                       [[0, 0, 0, 1, 1, 1, 2, 2, 2],
...                        [0, 0, 0, 0, 1, 2, 0, 1, 2]])
>>> s1 = ps.Series([45, 200, 1.2, 30, 250, 1.5, 320, 1, 0.3],
...                index=midx1)
>>> s2 = ps.Series([45, 200, 1.2, 30, 250, 1.5, 320, 1, 0.3],
...                index=midx2)

>>> s1.index.symmetric_difference(s2.index)  
MultiIndex([('pandas-on-Spark', 'speed'),
            (  'lama', 'speed')],
           )

You can set the names of the resulting index with result_name.

>>> s1.index.symmetric_difference(s2.index, result_name=['a', 'b'])  
MultiIndex([('pandas-on-Spark', 'speed'),
            (  'lama', 'speed')],
           names=['a', 'b'])

Set sort to True to sort the resulting index.

>>> s1.index.symmetric_difference(s2.index, sort=True)  
MultiIndex([('pandas-on-Spark', 'speed'),
            (  'lama', 'speed')],
           )

You can also use the ^ operator:

>>> s1.index ^ s2.index  
MultiIndex([('pandas-on-Spark', 'speed'),
            (  'lama', 'speed')],
           )