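The reindex() method in Pandas conforms a DataFrame or Series to a new index (row labels), new column labels, or both. Existing data is realigned to the new labels, and labels that have no existing data are filled with missing values.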
Example
import pandas as pd
# create a sample DataFrame
data = {'A': [1, 2, 3],
        'B': [4, 5, 6]}
df = pd.DataFrame(data)
# use reindex() to conform the rows to the labels 0 through 3
reindexed_df = df.reindex([0, 1, 2, 3])
print(reindexed_df)
'''
Output
     A    B
0  1.0  4.0
1  2.0  5.0
2  3.0  6.0
3  NaN  NaN
'''
Here, the original index of df was [0, 1, 2], and we used reindex() to conform it to [0, 1, 2, 3] and form reindexed_df. Since label 3 did not exist in df, the new row is filled with NaN values by default.
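The NaN fill also changes the column dtypes, which is why the integers in the output above are displayed as 1.0, 2.0 and so on. A minimal sketch that rebuilds the same df and inspects one column before and after reindexing (the names dtype_before, dtype_after and new_row_is_missing are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
reindexed_df = df.reindex([0, 1, 2, 3])

# integer columns are upcast to float because NaN is a floating-point value
dtype_before = df['A'].dtype
dtype_after = reindexed_df['A'].dtype
print(dtype_before, dtype_after)

# every cell in the newly added row is missing
new_row_is_missing = reindexed_df.loc[3].isna().all()
print(new_row_is_missing)
```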
reindex() Syntax
The syntax of the reindex() method in Pandas is:
obj.reindex(labels=None, index=None, columns=None, method=None, fill_value=nan, limit=None, tolerance=None, copy=True)
reindex() Arguments
The reindex() method takes the following arguments:
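- labels (optional) - new sequence of labels; when passed positionally to a DataFrame, as in df.reindex([0, 1, 2, 3]), it applies to the rows
- index (optional) - new sequence for the index (row labels)
- columns (optional) - new sequence for the column labels
- method (optional) - specifies the method to use for filling holes in the reindexed data: 'ffill' (or 'pad'), 'bfill' (or 'backfill'), or 'nearest'; it applies only when the existing index is monotonically increasing or decreasing
- fill_value (optional) - substitute value to use for missing data introduced by reindexing, default is NaN
- limit (optional) - specifies the maximum number of consecutive elements to forward or backward fill
- tolerance (optional) - specifies the maximum distance between an existing label and a new label for an inexact match to be used
- copy (optional) - specifies whether to always copy the data, default is True.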
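To make the relationship between labels, index and columns concrete, here is a small sketch. It assumes the same two-column DataFrame used in the examples; the variable names by_labels, by_index and both are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# for a DataFrame, positional labels and index= both target the rows
by_labels = df.reindex([2, 0])
by_index = df.reindex(index=[2, 0])
print(by_labels.equals(by_index))

# index= and columns= can be combined in a single call
both = df.reindex(index=[2, 0], columns=['B', 'A'])
print(both)
```

Passing index and columns by keyword makes it explicit which axis each sequence of labels applies to.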
reindex() Return Value
The reindex() method returns an object (Series or DataFrame) conformed to the new index and columns.
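The following sketch illustrates the return value on a Series, which the examples in this article do not otherwise cover. The Series s and its labels are made up for illustration. It shows that the result follows the requested labels while the original object is left unchanged.

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
result = s.reindex(['c', 'a', 'z'])

# the result is a Series conformed to the requested labels;
# 'z' has no data in s, so it is missing in the result
print(result)

# the original Series keeps its own index and values
print(s.index.tolist())
```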
Example 1: reindex() with columns Argument
import pandas as pd
# create a sample DataFrame
data = {'A': [1, 2, 3],
        'B': [4, 5, 6]}
df = pd.DataFrame(data)
# use reindex() to change the columns
reindexed_df = df.reindex(columns=['B', 'A', 'C'])
print(reindexed_df)
Output
   B  A   C
0  4  1 NaN
1  5  2 NaN
2  6  3 NaN
In this example, we rearranged the columns and added a new column C to the DataFrame using reindex().
Here,
- The order of existing columns A and B was changed.
- A new column C with default NaN values was added.
- The original columns of df, [A, B], were changed in reindexed_df to [B, A, C].
Example 2: Using fill_value with reindex()
import pandas as pd
# sample DataFrame
data = {'Values': [10, 20, 30]}
df = pd.DataFrame(data, index=['a', 'b', 'c'])
# reindex and fill missing values with 0
reindexed_df = df.reindex(['a', 'x', 'b', 'y', 'c'], fill_value=0)
print(reindexed_df)
Output
   Values
a      10
x       0
b      20
y       0
c      30
In the example above, the new labels x and y have no data in df, so we filled them with 0 instead of NaN using the fill_value argument.
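One practical consequence of fill_value can be sketched by comparing it with the default fill on the same data. The names new_labels, default_fill and zero_fill are hypothetical and used only for this comparison.

```python
import pandas as pd

df = pd.DataFrame({'Values': [10, 20, 30]}, index=['a', 'b', 'c'])
new_labels = ['a', 'x', 'b', 'y', 'c']

# default: rows for new labels are filled with NaN
default_fill = df.reindex(new_labels)
# fill_value=0: rows for new labels are filled with 0
zero_fill = df.reindex(new_labels, fill_value=0)

# NaN forces a float column; an integer fill value keeps the column integer
print(default_fill['Values'].dtype)
print(zero_fill['Values'].dtype)
```

Choosing a fill value of the same type as the column is what keeps the output in Example 2 free of decimal points.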
Example 3: reindex() with method Argument
import pandas as pd
# sample DataFrame with numeric index
data = {'Value': [10, 20, 30]}
df = pd.DataFrame(data, index=[0, 2, 4])
# use reindex and forward fill values
reindexed_df = df.reindex([0, 1, 2, 3, 4], method='ffill')
print(reindexed_df)
Output
   Value
0     10
1     10
2     20
3     20
4     30
Here, we passed 'ffill' to the method argument to fill the new labels. Forward fill means that each new label takes the value of the closest existing label before it, so label 1 takes the value at 0 and label 3 takes the value at 2.
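The method argument fills new labels by matching them to existing labels. It does not fill missing values that were already in the data. A minimal sketch, assuming an illustrative Series s with a NaN stored at label 2 (this Series is not part of the example above):

```python
import numpy as np
import pandas as pd

# label 2 already exists but holds a missing value
s = pd.Series([10, np.nan, 30], index=[0, 2, 4])

# new labels 1 and 3 are matched to existing labels 0 and 2;
# label 3 therefore picks up the NaN stored at label 2
filled = s.reindex([0, 1, 2, 3, 4], method='ffill')
print(filled)

# to also fill values that were already missing, forward fill after reindexing
chained = s.reindex([0, 1, 2, 3, 4]).ffill()
print(chained)
```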
Example 4: Using the method and limit Arguments
import pandas as pd
# sample DataFrame with numeric index
data = {'Value': [10, 20, 30]}
df = pd.DataFrame(data, index=[0, 5, 10])
# use reindex with backward fill up to 2 places
reindexed_df = df.reindex(range(12), method='bfill', limit=2)
print(reindexed_df)
Output
    Value
0    10.0
1     NaN
2     NaN
3    20.0
4    20.0
5    20.0
6     NaN
7     NaN
8    30.0
9    30.0
10   30.0
11    NaN
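In this example, we used the method argument with 'bfill' for backward filling and the limit argument to restrict the filling to 2 consecutive places.
In each gap, only the two new labels closest to the next existing label are filled: 3 and 4 take the value at 5, and 8 and 9 take the value at 10. Labels 1, 2, 6 and 7 are beyond the limit and remain NaN, and label 11 remains NaN because there is no later label to fill from.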
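For comparison, the same limit can be combined with forward filling. The sketch below reuses the data from this example and only swaps 'bfill' for 'ffill'. The names ffilled and still_missing are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'Value': [10, 20, 30]}, index=[0, 5, 10])

# forward fill at most 2 consecutive new labels after each existing label
ffilled = df.reindex(range(12), method='ffill', limit=2)

# labels that are still missing after the limited forward fill
still_missing = ffilled.index[ffilled['Value'].isna()].tolist()
print(still_missing)
```

With forward fill the filled labels sit right after each existing label, so the unfilled positions fall at the end of each gap rather than at the start.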
Example 5: Using tolerance with Float Index
import pandas as pd
# sample DataFrame with float index
data = {'Value': [10, 20, 30]}
df = pd.DataFrame(data, index=[0.0, 1.4, 3.0])
# reindex with tolerance
reindexed_df = df.reindex([0, 1, 2, 3], method='nearest', tolerance=0.5)
print(reindexed_df)
Output
   Value
0   10.0
1   20.0
2    NaN
3   30.0
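Here, the tolerance argument defines the maximum allowed distance between a new label and the existing label it is matched to.
The nearest existing label to 1 is 1.4, at a distance of 0.4, so label 1 takes its value. The nearest existing label to 2 is also 1.4, but the distance is 0.6 (> 0.5), so no match is made and label 2 is filled with NaN.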
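The effect of tolerance is easiest to see next to the same call without it. A minimal sketch on the data from this example (the names new_labels, no_tolerance and with_tolerance are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'Value': [10, 20, 30]}, index=[0.0, 1.4, 3.0])
new_labels = [0, 1, 2, 3]

# without tolerance, every new label is matched to its nearest existing label
no_tolerance = df.reindex(new_labels, method='nearest')
# with tolerance, matches farther away than 0.5 are rejected
with_tolerance = df.reindex(new_labels, method='nearest', tolerance=0.5)

print(no_tolerance.loc[2, 'Value'])     # matched to 1.4, the nearest label
print(with_tolerance.loc[2, 'Value'])   # no label within 0.5, so missing
```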