Pandas itertuples()

The itertuples() method in Pandas is used to iterate over the rows of a DataFrame.

Example

import pandas as pd

# create a DataFrame
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

# use itertuples() to iterate over rows
for row in df.itertuples():
    print(row)

'''
Output
Pandas(Index=0, A=1, B=3)
Pandas(Index=1, A=2, B=4)
'''

itertuples() Syntax

The syntax of the itertuples() method in Pandas is:

df.itertuples(index=True, name='Pandas')

Note: A typical use of itertuples() would look like this:

for row in df.itertuples(index=True, name='Pandas'):
    # do something using row
    pass

itertuples() Arguments

The itertuples() method takes the following arguments:

  • index (optional) - a boolean that specifies whether to include the index as the first element of each tuple; defaults to True
  • name (optional) - the name of the namedtuple type to be returned; defaults to 'Pandas'. If set to None, a regular tuple is returned instead.
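
As a small sketch of the name=None case described above, the snippet below collects the rows of a two-column DataFrame as regular tuples, first with the index and then without it. The variable names plain_rows and values_only are chosen only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

# name=None yields regular tuples in the order (index, A, B)
plain_rows = list(df.itertuples(name=None))
print(plain_rows)    # [(0, 1, 3), (1, 2, 4)]

# combining index=False with name=None drops the index value as well
values_only = list(df.itertuples(index=False, name=None))
print(values_only)   # [(1, 3), (2, 4)]
```

Regular tuples have no field names, so their elements can only be accessed by position.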

Note: A namedtuple is a tuple subclass with named fields. It is created with the namedtuple() function from the collections module, and its elements can be accessed by field name as well as by position.
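
To make the idea of a namedtuple concrete, here is a minimal sketch that does not involve Pandas at all. The type name Point and its fields x and y are hypothetical and used only for illustration.

```python
from collections import namedtuple

# Point is a made-up namedtuple type with two fields
Point = namedtuple('Point', ['x', 'y'])
p = Point(x=1, y=2)

print(p.x, p.y)              # 1 2  (access by field name)
print(p[0], p[1])            # 1 2  (access by position, like a regular tuple)
print(isinstance(p, tuple))  # True
```

The rows produced by itertuples() behave the same way, with the column names as field names.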


itertuples() Return Value

The itertuples() method returns an iterator that yields one namedtuple for each row in the DataFrame (or one regular tuple per row if name is None).
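
Because the return value is an iterator rather than a list, rows are produced one at a time and each row can be consumed only once. The following sketch shows this with next(); the variable names rows, first and remaining are chosen only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

rows = df.itertuples()

# fetch the first row manually
first = next(rows)
print(first)             # Pandas(Index=0, A=1, B=3)

# the iterator continues from where it stopped
remaining = list(rows)
print(len(remaining))    # 1
```

To loop over the rows a second time, call itertuples() again.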


Example 1: Basic Iteration Using itertuples()

import pandas as pd

# create a DataFrame
data = {'Column1': [1, 2, 3],
        'Column2': ['A', 'B', 'C']}
df = pd.DataFrame(data)

# use itertuples() to iterate over rows
for row in df.itertuples():
    # access data from each row
    print(row.Column1, row.Column2)

Output

1 A
2 B
3 C

In the above example, we have used itertuples() to loop over the rows of the df DataFrame.

In each iteration of the loop, the code retrieves the values from Column1 and Column2 of the DataFrame using row.Column1 and row.Column2.
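
Attribute access is not the only option. Since each row is a namedtuple, the standard namedtuple tools should also work on it. The sketch below inspects only the first row; the variable name first_row is chosen for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'Column1': [1, 2, 3], 'Column2': ['A', 'B', 'C']})

first_row = next(df.itertuples())

# field names of the row, with Index first
print(first_row._fields)

# positional access: position 1 holds the same value as first_row.Column1
print(first_row[1])

# convert the row to a dictionary keyed by field name
print(first_row._asdict())
```

Positional access can be useful when a column name is not convenient to write as an attribute.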


Example 2: Using itertuples() with and without the index Argument

import pandas as pd

# create a DataFrame
data = {'Column1': [10, 20, 30], 'Column2': ['A', 'B', 'C']}
df = pd.DataFrame(data)

# iterating with index=True
print("Iterating with index=True:")
for row in df.itertuples(index=True):
    print(row)

# iterating with index=False
print("\nIterating with index=False:")
for row in df.itertuples(index=False):
    print(row)

Output

Iterating with index=True:
Pandas(Index=0, Column1=10, Column2='A')
Pandas(Index=1, Column1=20, Column2='B')
Pandas(Index=2, Column1=30, Column2='C')

Iterating with index=False:
Pandas(Column1=10, Column2='A')
Pandas(Column1=20, Column2='B')
Pandas(Column1=30, Column2='C')

Here,

  1. index=True - includes the df DataFrame's index as the first element of each tuple.
  2. index=False - excludes the index from the tuples, showing only the data from the DataFrame's columns.
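
The Index field holds the row's index label, which is not necessarily a number. As a small sketch, assuming a DataFrame with the made-up string labels 'x', 'y' and 'z':

```python
import pandas as pd

# the labels 'x', 'y', 'z' are hypothetical and used only for illustration
df = pd.DataFrame({'Column1': [10, 20, 30]}, index=['x', 'y', 'z'])

# collect the index label of every row
labels = [row.Index for row in df.itertuples()]
print(labels)   # ['x', 'y', 'z']
```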

Example 3: Provide Custom Name for the NamedTuple

import pandas as pd

# create a DataFrame
df = pd.DataFrame({'Column1': [1, 2, 3], 'Column2': ['A', 'B', 'C']})

# use itertuples() with a custom name for the namedtuple
for row in df.itertuples(name='RowData'):
    # access elements in the namedtuple
    print(f"Index: {row.Index}, Column1: {row.Column1}, Column2: {row.Column2}")

Output

Index: 0, Column1: 1, Column2: A
Index: 1, Column1: 2, Column2: B
Index: 2, Column1: 3, Column2: C

In this example, we have used itertuples() to iterate over the rows of the df DataFrame.

The name argument is set to RowData. This means each row is represented as a namedtuple called RowData.

Inside the loop, we accessed the index of the row with row.Index, and the data in Column1 and Column2 with row.Column1 and row.Column2, respectively.

Note: A descriptive name such as RowData makes it clear that each item produced by the loop represents a row of data, which can make the code easier to read and maintain.
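
One way to see the effect of the name argument is to compare the type names of a row created with the default name and a row created with a custom one. This is only a sketch; the variable names default_row and custom_row are chosen for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'Column1': [1, 2, 3], 'Column2': ['A', 'B', 'C']})

# first row with the default name and with a custom name
default_row = next(df.itertuples())
custom_row = next(df.itertuples(name='RowData'))

print(type(default_row).__name__)   # Pandas
print(type(custom_row).__name__)    # RowData

# the fields are accessed in the same way in both cases
print(default_row.Column1 == custom_row.Column1)   # True
```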