The iterrows() method in Pandas is used to iterate over the rows of a DataFrame.
Example
import pandas as pd
# sample DataFrame
df = pd.DataFrame({
'Fruit': ['Apple', 'Banana', 'Cherry']
})
# iterate over rows using iterrows()
for index, row in df.iterrows():
    print(f"At index {index}, the fruit is {row['Fruit']}.")
'''
Output
At index 0, the fruit is Apple.
At index 1, the fruit is Banana.
At index 2, the fruit is Cherry.
'''
iterrows() Syntax
The syntax of the iterrows() method in Pandas is:
for index, row_series in dataframe.iterrows():
    # do something with index and row_series
Where,
- index - index of the current row
- row_series - data of the current row
iterrows() Return Value
The iterrows() method on a pandas DataFrame returns an iterator that yields (index, row) tuples, where index is the label of the row and row is a Series containing the data of that row.
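To make the return value concrete, here is a minimal sketch that pulls a single pair out of the iterator with next() instead of a for loop. The variable name first_pair is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Names': ['Alice', 'Bob', 'Charlie'],
    'Scores': [85, 90, 78]
})

# iterrows() returns a lazy iterator; next() pulls its first (index, row) pair
first_pair = next(df.iterrows())
index, row = first_pair

print(type(first_pair).__name__)  # tuple
print(index)                      # 0
print(type(row).__name__)         # Series
print(list(row.index))            # ['Names', 'Scores']
```

The labels of the row Series are the column names of the DataFrame, which is why a value can be read with row['Names'].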
Example 1: Basic Iteration Using iterrows()
import pandas as pd
# create a DataFrame
df = pd.DataFrame({
'Names': ['Alice', 'Bob', 'Charlie'],
'Scores': [85, 90, 78]
})
# iterate over each row in the DataFrame using iterrows()
for index, row in df.iterrows():
    # for each row, print the name and score using formatted string
    print(f"{row['Names']} scored {row['Scores']} points.")
Output
Alice scored 85 points.
Bob scored 90 points.
Charlie scored 78 points.
In the above example, we have used the iterrows() method to loop over the rows of the df DataFrame.
For each iteration (row) inside the loop:
- The index of the row is stored in the index variable.
- The data of the row (as a Series) is stored in the row variable.
- The values in the Names and Scores columns are accessed using row['Names'] and row['Scores'] respectively.
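Because each row arrives as a single Series, all of its values share one dtype, so column dtypes are not necessarily preserved across a row. The sketch below illustrates this with a hypothetical DataFrame (the Count and Weight columns are made up for the illustration) and compares it with itertuples(), which keeps each value in its column's own type.

```python
import pandas as pd

# hypothetical frame: one integer column and one float column
df = pd.DataFrame({'Count': [1, 2], 'Weight': [1.5, 2.5]})

# the row Series needs one common dtype, so the integer is upcast to float
index, row = next(df.iterrows())
print(row.dtype)     # float64
print(row['Count'])  # 1.0

# itertuples() yields namedtuples, so Count stays an integer
first = next(df.itertuples())
print(first.Count)   # 1
```

If the original types matter inside the loop, itertuples() may be the safer choice.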
Example 2: Filtering Rows With Specific Criteria
import pandas as pd
# create a sample DataFrame
df = pd.DataFrame({
'Names': ['Alice', 'Bob', 'Charlie', 'David'],
'Scores': [85, 78, 90, 72]
})
# initialize an empty list to store rows that meet the condition
filtered_rows = []
# iterate through the DataFrame using iterrows()
for index, row in df.iterrows():
    # if score for the current row is greater than 80, append the row to filtered_rows
    if row['Scores'] > 80:
        filtered_rows.append(row)
# convert the list of filtered rows back into a DataFrame
filtered_df = pd.DataFrame(filtered_rows)
# display the resulting filtered DataFrame
print(filtered_df)
Output
     Names  Scores
0    Alice      85
2  Charlie      90
In the above example, we have used the iterrows() method to iterate over the df DataFrame.
We filtered rows where scores are greater than 80, and appended these rows to the filtered_rows list.
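For comparison, the same filter can be sketched without an explicit loop by using a boolean mask. This is an alternative technique, not part of the iterrows() example; the variable name mask is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Names': ['Alice', 'Bob', 'Charlie', 'David'],
    'Scores': [85, 78, 90, 72]
})

# build a boolean Series that is True where the score is greater than 80
mask = df['Scores'] > 80

# indexing with the mask keeps only the matching rows and their original index
filtered_df = df[mask]
print(filtered_df)
```

This selects the same two rows (index 0 and 2) and leaves the column dtypes unchanged.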
Example 3: Modifying the DataFrame within the Loop
import pandas as pd
# creating a DataFrame
df = pd.DataFrame({
'Names': ['Alice', 'Bob', 'Charlie'],
'Scores': [85, 70, 92]
})
# iterate over each row of the DataFrame
for index, row in df.iterrows():
    # assign grades based on scores
    if row['Scores'] >= 90:
        grade = 'A'
    elif row['Scores'] >= 80:
        grade = 'B'
    elif row['Scores'] >= 70:
        grade = 'C'
    else:
        grade = 'F'
    # set the grade in the 'Grade' column for the current row
    df.at[index, 'Grade'] = grade
# display DataFrame with names, scores, and the assigned grades
print(df)
Output
     Names  Scores Grade
0    Alice      85     B
1      Bob      70     C
2  Charlie      92     A
Here, we used iterrows() to iterate through each row. Based on the value in the Scores column for each row, we determined the grade.
We then set the grade in the Grade column for the current row using the .at accessor.
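However, it's important to note that modifying a DataFrame within a loop using iterrows() can be inefficient, because every row is first converted to a new Series and each .at call writes only a single cell.
In addition, depending on the data types, iterrows() yields a copy of each row rather than a view, so assigning to the row variable itself may have no effect on the DataFrame. Writing through df.at[index, column], as in this example, does update the DataFrame, but the pandas documentation advises against modifying an object while iterating over it.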
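The caution above can be checked with a short sketch. It first shows that, for this mixed-type DataFrame, assigning to row leaves df untouched. It then assigns the same grades without a loop using numpy.select, which is a swapped-in alternative rather than part of the iterrows() approach. The variable name scores is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Names': ['Alice', 'Bob', 'Charlie'],
    'Scores': [85, 70, 92]
})

# writing to row only changes the temporary Series, not df
for index, row in df.iterrows():
    row['Scores'] = 0

print(df['Scores'].tolist())  # [85, 70, 92]

# loop-free grading: conditions are checked in order and the first match wins
scores = df['Scores']
df['Grade'] = np.select(
    [scores >= 90, scores >= 80, scores >= 70],
    ['A', 'B', 'C'],
    default='F'
)
print(df)
```

The conditions are listed from the highest threshold down, mirroring the if/elif chain in the example.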
Example 4: Using Multiple Columns With iterrows()
import pandas as pd
# sample DataFrame
df = pd.DataFrame({
'Product': ['Widget', 'Gadget', 'Doodad'],
'Price': [25.50, 40.00, 15.75],
'Quantity': [100, 50, 200]
})
# calculate total sales for each product using Price and Quantity columns
for index, row in df.iterrows():
    total_sales = row['Price'] * row['Quantity']
    df.at[index, 'TotalSales'] = total_sales
print(df)
Output
  Product  Price  Quantity  TotalSales
0  Widget  25.50       100      2550.0
1  Gadget  40.00        50      2000.0
2  Doodad  15.75       200      3150.0
In the above example, we've added a new column, TotalSales, which is computed by multiplying the Price and Quantity columns for each product.
We accessed and operated on multiple columns, Price and Quantity, within the loop.
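As a point of comparison, the same TotalSales column can be sketched with element-wise column arithmetic instead of a row loop. This is an alternative to iterrows(), shown here only to illustrate the difference in approach.

```python
import pandas as pd

df = pd.DataFrame({
    'Product': ['Widget', 'Gadget', 'Doodad'],
    'Price': [25.50, 40.00, 15.75],
    'Quantity': [100, 50, 200]
})

# multiply the two columns element-wise; no explicit loop is needed
df['TotalSales'] = df['Price'] * df['Quantity']
print(df)
```

For simple arithmetic between columns, this form is usually shorter and faster than iterating row by row.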