Pandas iterrows()

The iterrows() method in Pandas is used to iterate over the rows of a DataFrame.

Example

import pandas as pd

# sample DataFrame 
df = pd.DataFrame({
    'Fruit': ['Apple', 'Banana', 'Cherry']
})

# iterate over rows using iterrows()
for index, row in df.iterrows():
    print(f"At index {index}, the fruit is {row['Fruit']}.")

'''
Output

At index 0, the fruit is Apple.
At index 1, the fruit is Banana.
At index 2, the fruit is Cherry.
'''

iterrows() Syntax

The syntax of the iterrows() method in Pandas is:

for index, row_series in dataframe.iterrows():
    # do something with index and row_series

Where,

  • index - index of the current row
  • row_series - data of the current row, as a pandas Series

iterrows() Return Value

The iterrows() method on a pandas DataFrame returns an iterator that yields pairs (tuples) containing the index and the data of each row.
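To make the return value concrete, here is a minimal sketch that pulls a single pair out of the iterator by hand with next(). The DataFrame and its 'a'/'b' index labels are made up for illustration only.

```python
import pandas as pd

# hypothetical sample data with string index labels, for illustration only
df = pd.DataFrame({'Fruit': ['Apple', 'Banana']}, index=['a', 'b'])

rows = df.iterrows()   # an iterator; no row has been produced yet
first = next(rows)     # manually pull the first (index, Series) pair

index, row = first
print(type(first).__name__)  # tuple
print(index)                 # a
print(type(row).__name__)    # Series
print(row['Fruit'])          # Apple
```

A for loop over df.iterrows() performs the same unpacking on every iteration.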


Example 1: Basic Iteration Using iterrows()

import pandas as pd

# create a DataFrame 
df = pd.DataFrame({
    'Names': ['Alice', 'Bob', 'Charlie'],
    'Scores': [85, 90, 78]
})

# iterate over each row in the DataFrame using iterrows()
for index, row in df.iterrows():
    # for each row, print the name and score using formatted string
    print(f"{row['Names']} scored {row['Scores']} points.")

Output

Alice scored 85 points.
Bob scored 90 points.
Charlie scored 78 points.

In the above example, we have used iterrows() to loop over the rows of the df DataFrame.

For each iteration (row) inside the loop:

  • The index of the row is stored in the index variable.
  • The data of the row (as a Series) is stored in the row variable.
  • The values in the Names and Scores columns are accessed using row['Names'] and row['Scores'] respectively.
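The points above can be checked directly. The following sketch, which reuses the DataFrame from this example, shows that each row Series is named after its index label and is indexed by the column names.

```python
import pandas as pd

df = pd.DataFrame({
    'Names': ['Alice', 'Bob', 'Charlie'],
    'Scores': [85, 90, 78]
})

name_matches = []
for index, row in df.iterrows():
    # the Series for a row carries that row's index label as its name
    name_matches.append(row.name == index)
    # the Series is indexed by the DataFrame's column names
    row_labels = list(row.index)

print(name_matches)  # [True, True, True]
print(row_labels)    # ['Names', 'Scores']
```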

Example 2: Filtering Rows With Specific Criteria

import pandas as pd

# create a sample DataFrame 
df = pd.DataFrame({
    'Names': ['Alice', 'Bob', 'Charlie', 'David'],
    'Scores': [85, 78, 90, 72]
})

# initialize an empty list to store rows that meet the condition
filtered_rows = []

# iterate through the DataFrame using iterrows()
for index, row in df.iterrows():
    # if score for the current row is greater than 80, append the row to filtered_rows
    if row['Scores'] > 80:
        filtered_rows.append(row)

# convert the list of filtered rows back into a DataFrame
filtered_df = pd.DataFrame(filtered_rows)

# display the resulting filtered DataFrame
print(filtered_df)

Output

     Names  Scores
0    Alice      85
2  Charlie      90

In the above example, we have used the iterrows() method to iterate over the df DataFrame.

We kept the rows where the score is greater than 80 by appending them to the filtered_rows list, then converted that list back into a DataFrame. The original index labels (0 and 2) are preserved.
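For comparison, the same filter can be written without a loop by using a boolean mask. This is a sketch of a common alternative, not a replacement for the iterrows() example above; the variable name mask is chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Names': ['Alice', 'Bob', 'Charlie', 'David'],
    'Scores': [85, 78, 90, 72]
})

# True for rows whose score is greater than 80, False otherwise
mask = df['Scores'] > 80

# keep only the rows where the mask is True
filtered_df = df[mask]
print(filtered_df)
```

This selects the same two rows (index 0 and 2) and keeps the original column data types.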


Example 3: Modifying the DataFrame within the Loop

import pandas as pd

# creating a DataFrame 
df = pd.DataFrame({
    'Names': ['Alice', 'Bob', 'Charlie'],
    'Scores': [85, 70, 92]
})

# iterate over each row of the DataFrame
for index, row in df.iterrows():
    # assign grades based on scores
    if row['Scores'] >= 90:
        grade = 'A'
    elif row['Scores'] >= 80:
        grade = 'B'
    elif row['Scores'] >= 70:
        grade = 'C'
    else:
        grade = 'F'

    # set the grade in the 'Grade' column for the current row
    df.at[index, 'Grade'] = grade

# display DataFrame with names, scores, and the assigned grades
print(df)

Output

     Names  Scores Grade
0    Alice      85     B
1      Bob      70     C
2  Charlie      92     A

Here, we used iterrows() to iterate through each row. Based on the value in the Scores column for each row, we determined the grade.

We then set the grade in the Grade column for the current row using the .at accessor.

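However, it's important to note that modifying a DataFrame within a loop using iterrows() is inefficient, because every iteration builds a new Series and every .at assignment updates a single cell.

Also, for a DataFrame with mixed column types like this one, iterrows() yields a copy of each row, not a view. Assigning to row itself (for example, row['Grade'] = grade) therefore does not change the original DataFrame. To write values back, assign through the DataFrame, as done above with df.at, or use an operation on whole columns.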
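The sketch below illustrates both points with the DataFrame from this example. First it assigns to row inside the loop and shows that df is left unchanged. Then it assigns the same grades in one step with numpy.select, which is one possible whole-column alternative and not the only one.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Names': ['Alice', 'Bob', 'Charlie'],
    'Scores': [85, 70, 92]
})

# writing to row changes only the temporary Series, not df
for index, row in df.iterrows():
    row['Scores'] = 0

scores_after_loop = df['Scores'].tolist()
print(scores_after_loop)  # [85, 70, 92]

# whole-column alternative: conditions are checked in order,
# and the first one that is True decides the grade
conditions = [
    df['Scores'] >= 90,
    df['Scores'] >= 80,
    df['Scores'] >= 70,
]
grades = ['A', 'B', 'C']
df['Grade'] = np.select(conditions, grades, default='F')

print(df['Grade'].tolist())  # ['B', 'C', 'A']
```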


Example 4: Using Multiple Columns With iterrows()

import pandas as pd

# sample DataFrame
df = pd.DataFrame({
    'Product': ['Widget', 'Gadget', 'Doodad'],
    'Price': [25.50, 40.00, 15.75],
    'Quantity': [100, 50, 200]
})

# calculate total sales for each product using Price and Quantity columns
for index, row in df.iterrows():
    total_sales = row['Price'] * row['Quantity']
    df.at[index, 'TotalSales'] = total_sales

print(df)

Output

   Product  Price  Quantity  TotalSales
0   Widget  25.50       100      2550.0
1   Gadget  40.00        50      2000.0
2   Doodad  15.75       200      3150.0

In the above example, we've added a new column TotalSales, which is computed by multiplying the Price and Quantity columns for each product.

We accessed and operated on multiple columns Price and Quantity within the loop.
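For a simple calculation like this one, the same column can also be computed without a loop by multiplying the two columns directly. The following is a sketch of that alternative, using the same sample data.

```python
import pandas as pd

df = pd.DataFrame({
    'Product': ['Widget', 'Gadget', 'Doodad'],
    'Price': [25.50, 40.00, 15.75],
    'Quantity': [100, 50, 200]
})

# multiply the two columns element by element, row for row
df['TotalSales'] = df['Price'] * df['Quantity']

print(df['TotalSales'].tolist())  # [2550.0, 2000.0, 3150.0]
```

iterrows() remains useful when each row needs logic that is hard to express as a column operation.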