The drop() method in Pandas is used to remove rows or columns from a DataFrame or Series.
Example
import pandas as pd
# create a sample DataFrame
data = {'A': [1, 2, 3],
'B': [4, 5, 6],
'C': [7, 8, 9]}
df = pd.DataFrame(data)
# drop a row by index (axis=0)
df.drop(1, axis=0, inplace=True)
# display DataFrame after dropping the row
print(df)
'''
Output
A B C
0 1 4 7
2 3 6 9
'''
drop() Syntax
The syntax of the drop() method in Pandas is:
drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
drop() Arguments
The drop() method takes the following arguments:
- labels - single label or list of labels for dropping rows or columns
- axis (optional) - axis=0 (default) to drop rows and axis=1 to drop columns
- index (optional) - row labels to drop, equivalent to using labels with axis=0
- columns (optional) - column labels to drop, equivalent to using labels with axis=1
- level (optional) - for a MultiIndex, specifies the level from which to drop labels
- inplace (optional) - True modifies the DataFrame directly, False (default) returns a new DataFrame
- errors (optional) - 'raise' (default) raises a KeyError for missing labels, 'ignore' skips them
drop() Return Value
The drop() method in Pandas returns a new DataFrame or Series with the specified rows or columns removed. If inplace=True is set, the original object is modified instead and the method returns None.
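The following is a minimal sketch of the return behavior, using made-up sample data. The variable names (new_df, result, s, s_dropped) are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# default (inplace=False): a new DataFrame is returned, df is unchanged
new_df = df.drop('B', axis=1)
print(list(new_df.columns))  # ['A']
print(list(df.columns))      # ['A', 'B']

# inplace=True: df itself is modified and the call returns None
result = df.drop(0, axis=0, inplace=True)
print(result)          # None
print(list(df.index))  # [1, 2]

# drop() also works on a Series, where only index labels can be dropped
s = pd.Series([10, 20, 30], index=['x', 'y', 'z'])
s_dropped = s.drop('y')
print(list(s_dropped.index))  # ['x', 'z']
```

Because the in-place call returns None, writing df = df.drop(..., inplace=True) would replace df with None, so the two styles should not be mixed.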
Example 1: Drop Row and Column From a DataFrame
import pandas as pd
# create a sample DataFrame
data = {'A': [1, 2, 3],
'B': [4, 5, 6],
'C': [7, 8, 9]}
df = pd.DataFrame(data)
# display the original DataFrame
print("Original DataFrame:")
print(df)
# drop a row by index (axis=0)
# drop the second row (index 1)
df.drop(1, axis=0, inplace=True)
# display the modified DataFrame after dropping the row
print("\nDataFrame after dropping a row:")
print(df)
# drop a column by name (axis=1)
# drop 'B' column
df.drop('B', axis=1, inplace=True)
# display the modified DataFrame after dropping the column
print("\nDataFrame after dropping a column:")
print(df)
Output
Original DataFrame:
A B C
0 1 4 7
1 2 5 8
2 3 6 9
DataFrame after dropping a row:
A B C
0 1 4 7
2 3 6 9
DataFrame after dropping a column:
A C
0 1 7
2 3 9
In this example, we first created the DataFrame named df, and then we used the drop() method to drop the row with index label 1 and the column named 'B' from the DataFrame.
Setting inplace=True modifies the DataFrame in place, and we can see the changes in the printed output.
Example 2: Drop Multiple Rows and Columns From a DataFrame
import pandas as pd
# create a sample DataFrame
data = {'A': [1, 2, 3, 4],
'B': [5, 6, 7, 8],
'C': [9, 10, 11, 12],
'D': [13, 14, 15, 16]}
df = pd.DataFrame(data)
# display the original DataFrame
print("Original DataFrame:")
print(df)
# drop multiple rows by index (axis=0)
# drop rows 1 and 2
rows_to_drop = [1, 2]
df.drop(rows_to_drop, axis=0, inplace=True)
# display the modified DataFrame after dropping rows
print("\nDataFrame after dropping multiple rows:")
print(df)
# drop multiple columns by name (axis=1)
# drop columns 'B' and 'D'
columns_to_drop = ['B', 'D']
df.drop(columns_to_drop, axis=1, inplace=True)
# display the modified DataFrame after dropping columns
print("\nDataFrame after dropping multiple columns:")
print(df)
Output
Original DataFrame:
A B C D
0 1 5 9 13
1 2 6 10 14
2 3 7 11 15
3 4 8 12 16
DataFrame after dropping multiple rows:
A B C D
0 1 5 9 13
3 4 8 12 16
DataFrame after dropping multiple columns:
A C
0 1 9
3 4 12
Here, we first create the df DataFrame and then use the drop() method to drop multiple rows (rows 1 and 2) and multiple columns ('B' and 'D') from df.
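Note that drop() matches index labels, not positions. In the examples so far the two coincide because the default index is 0, 1, 2, and so on. The sketch below uses a made-up DataFrame with a non-default index to show the difference. Looking up labels through df.index is one possible way to drop by position, not the only one.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]},
                  index=[10, 20, 30, 40])

# drop() matches labels, so 20 and 30 here are index labels
by_label = df.drop([20, 30])

# to drop by position, look up the labels at those positions first
by_position = df.drop(df.index[[1, 2]])

print(by_label.equals(by_position))  # True
print(list(by_label.index))          # [10, 40]
```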
Example 3: Use index and columns Arguments in drop()
The index argument allows us to specify the label(s) of rows we want to drop, and the columns argument allows us to specify the label(s) of columns we want to drop.
Note: index and columns provide an alternative to using the labels argument with axis=0 and axis=1 respectively.
Let's look at an example.
import pandas as pd
# create a DataFrame
data = {'A': [1, 2, 3],
'B': [4, 5, 6],
'C': [7, 8, 9]}
# assign custom row labels
df = pd.DataFrame(data, index=['X', 'Y', 'Z'])
# display original DataFrame
print("Original DataFrame:")
print(df)
# drop specific rows using the index argument (inplace=False)
# drop rows 'Y' and 'Z' without modifying original DataFrame
rows_to_drop = ['Y', 'Z']
df_dropped_rows = df.drop(index=rows_to_drop, inplace=False)
# display the modified DataFrame after dropping rows
print("\nDataFrame after dropping specific rows:")
print(df_dropped_rows)
# drop specific columns using the columns argument (inplace=False)
# drop columns 'A' and 'C' without modifying original DataFrame
columns_to_drop = ['A', 'C']
df_dropped_columns = df.drop(columns=columns_to_drop, inplace=False)
# display the modified DataFrame after dropping columns
print("\nDataFrame after dropping specific columns:")
print(df_dropped_columns)
Output
Original DataFrame:
A B C
X 1 4 7
Y 2 5 8
Z 3 6 9
DataFrame after dropping specific rows:
A B C
X 1 4 7
DataFrame after dropping specific columns:
B
X 4
Y 5
Z 6
Here, we first displayed the original DataFrame with the custom row labels 'X', 'Y', and 'Z'.
Then, we displayed the DataFrame
- after dropping rows 'Y' and 'Z', leaving only row 'X', and
- after dropping columns 'A' and 'C', leaving only column 'B'.
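As a short sketch of the equivalence described in the note above, the code below compares the index argument with labels plus axis=0, and also passes index and columns together in one call. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]},
                  index=['X', 'Y', 'Z'])

# index=... gives the same result as labels with axis=0
via_index = df.drop(index=['Y', 'Z'])
via_labels = df.drop(['Y', 'Z'], axis=0)
print(via_index.equals(via_labels))  # True

# index and columns can be combined to drop rows and columns in one call
both = df.drop(index='Y', columns=['A', 'C'])
print(list(both.index))    # ['X', 'Z']
print(list(both.columns))  # ['B']
```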
Example 4: Drop Labels From MultiIndex DataFrame
import pandas as pd
# create a MultiIndex DataFrame
data = {'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]}
index = pd.MultiIndex.from_tuples([('X', 'a'),
('X', 'b'),
('Y', 'a'),
('Y', 'b')],
names=['Group', 'Label'])
df = pd.DataFrame(data, index=index)
# drop rows with 'X' in the 'Group' level
df_dropped = df.drop('X', level='Group')
print(df_dropped)
Output
A B
Group Label
Y a 3 7
b 4 8
In the above example, we used the level parameter to specify that we want to drop rows where the Group level is equal to X.
The resulting DataFrame df_dropped contains only rows with Y in the Group level.
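The same approach can be sketched for the other level, and for a single row identified by its full label tuple. The code below reuses the index from the example above. The names no_a and no_xa are illustrative.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('X', 'a'), ('X', 'b'), ('Y', 'a'), ('Y', 'b')],
    names=['Group', 'Label'])
df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]}, index=index)

# drop every row whose 'Label' level is 'a'
no_a = df.drop('a', level='Label')
print(list(no_a.index))  # [('X', 'b'), ('Y', 'b')]

# drop one specific row by passing its full (Group, Label) tuple
no_xa = df.drop(index=('X', 'a'))
print(list(no_xa.index))  # [('X', 'b'), ('Y', 'a'), ('Y', 'b')]
```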
Example 5: Error Handling Using errors Argument Inside drop()
In Pandas, the errors argument of the drop() method determines what happens when a specified label is not found in the DataFrame.
When,
- errors='raise' (default) - raises a KeyError if any of the specified labels is not found in the DataFrame
- errors='ignore' - skips labels that are not found and drops only the ones that exist
Let's look at an example.
import pandas as pd
# create a sample DataFrame
data = {'A': [1, 2, 3],
'B': [4, 5, 6],
'C': [7, 8, 9]}
df = pd.DataFrame(data, index=['X', 'Y', 'Z'])
# attempt to drop a non-existent row label with errors='raise' (default)
try:
    df.drop(index='W', errors='raise')
except KeyError as e:
    print(f"Error: {e}")
# attempt to drop a non-existent row label with errors='ignore'
df_dropped = df.drop(index='W', errors='ignore')
# display the result after attempting to drop the non-existent row label
print("\nResult after dropping non-existent row with errors='ignore':")
print(df_dropped)
Output
Error: "['W'] not found in axis"
Result after dropping non-existent row with errors='ignore':
A B C
X 1 4 7
Y 2 5 8
Z 3 6 9
Here, we first attempt to drop a non-existent row label 'W' with errors='raise', which raises a KeyError because 'W' is not found in the df DataFrame.
Then, we attempt to drop the same non-existent row label 'W' with errors='ignore', which silently ignores the label 'W' and drops nothing, resulting in the original DataFrame.
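A case the example above does not cover is a list that mixes existing and missing labels. The sketch below, with made-up data, shows that errors='ignore' still drops the labels that exist, while errors='raise' fails the whole call and leaves the data untouched.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=['X', 'Y', 'Z'])

# 'Y' exists and is dropped; 'W' does not exist and is skipped
result = df.drop(index=['Y', 'W'], errors='ignore')
print(list(result.index))  # ['X', 'Z']

# with the default errors='raise', the same call raises KeyError
raised = False
try:
    df.drop(index=['Y', 'W'])
except KeyError:
    raised = True
print(raised)          # True
print(list(df.index))  # ['X', 'Y', 'Z'] (df itself is unchanged)
```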