pyspark.pandas.DataFrame.itertuples¶
- 
DataFrame.itertuples(index: bool = True, name: Optional[str] = 'PandasOnSpark') → Iterator[Tuple][source]¶
- Iterate over DataFrame rows as namedtuples. - Parameters
- indexbool, default True
- If True, return the index as the first element of the tuple. 
- namestr or None, default “PandasOnSpark”
- The name of the returned namedtuples or None to return regular tuples. 
 
- Returns
- iterator
- An object to iterate over namedtuples for each row in the DataFrame with the first field possibly being the index and following fields being the column values. 
 
 - See also - DataFrame.iterrows
- Iterate over DataFrame rows as (index, Series) pairs. 
- DataFrame.items
- Iterate over (column name, Series) pairs. 
 - Notes - The column names will be renamed to positional names if they are invalid Python identifiers, repeated, or start with an underscore. - Examples - >>> df = ps.DataFrame({'num_legs': [4, 2], 'num_wings': [0, 2]}, ... index=['dog', 'hawk']) >>> df num_legs num_wings dog 4 0 hawk 2 2 - >>> for row in df.itertuples(): ... print(row) ... PandasOnSpark(Index='dog', num_legs=4, num_wings=0) PandasOnSpark(Index='hawk', num_legs=2, num_wings=2) - By setting the index parameter to False we can remove the index as the first element of the tuple: - >>> for row in df.itertuples(index=False): ... print(row) ... PandasOnSpark(num_legs=4, num_wings=0) PandasOnSpark(num_legs=2, num_wings=2) - With the name parameter set we set a custom name for the yielded namedtuples: - >>> for row in df.itertuples(name='Animal'): ... print(row) ... Animal(Index='dog', num_legs=4, num_wings=0) Animal(Index='hawk', num_legs=2, num_wings=2)