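The reindex() method in Pandas conforms a DataFrame or Series to a new set of row labels (index), column labels, or both. Labels that are not in the original object are added with missing values, and labels that are left out are dropped.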
Example
import pandas as pd
# create a sample DataFrame
data = {'A': [1, 2, 3],
'B': [4, 5, 6]}
df = pd.DataFrame(data)
# use reindex() to conform the rows to a new index
reindexed_df = df.reindex([0, 1, 2, 3])
print(reindexed_df)
'''
Output
A B
0 1.0 4.0
1 2.0 5.0
2 3.0 6.0
3 NaN NaN
'''
Here, the original indices of df were [0, 1, 2] and we changed them to [0, 1, 2, 3] to form reindexed_df using reindex(). Since one row (3) was added while doing so, it is filled with NaN values by default.
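The output above shows 1.0 rather than 1 because the NaN row forces the integer columns to become floats. The short sketch below, using the same sample data, checks this and also confirms that reindex() leaves the original DataFrame unchanged.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
reindexed_df = df.reindex([0, 1, 2, 3])

# reindex() returns a new object; the original keeps its 3 rows
print(df.shape)            # (3, 2)
print(reindexed_df.shape)  # (4, 2)

# the original column holds integers, the reindexed one holds floats
# because NaN cannot be stored in an integer column
print(df['A'].dtype)
print(reindexed_df['A'].dtype)  # float64
```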
reindex() Syntax
The syntax of the reindex() method in Pandas is:
obj.reindex(labels=None, index=None, columns=None, method=None, fill_value=NaN, limit=None, tolerance=None, copy=True)
reindex() Arguments
The reindex() method takes the following arguments:
- labels (optional) - new sequence of labels
- index (optional) - new sequence for the index (row labels)
- columns (optional) - new sequence for the column labels
- method (optional) - specifies the method to use for filling holes in the reindexed data
- fill_value (optional) - substitute value to use when introducing missing data, default is NaN
- limit (optional) - specifies the maximum number of consecutive elements to forward or backward fill
- tolerance (optional) - specifies the maximum distance between original and new labels for inexact matches
- copy (optional) - specifies whether to always copy the data, default is True.
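As a minimal sketch of how the index and columns arguments combine, the snippet below conforms rows and columns in one call. The variable names (both, by_position, by_keyword) are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# conform rows and columns in a single call
both = df.reindex(index=[2, 1, 0, 3], columns=['B', 'A'])
print(both)

# for a DataFrame, labels passed positionally are treated as row labels
by_position = df.reindex([0, 1])
by_keyword = df.reindex(index=[0, 1])
print(by_position.equals(by_keyword))  # True
```

Rows are reordered to 2, 1, 0 and a new all-NaN row 3 is added, while the columns are swapped to B, A.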
reindex() Return Value
The reindex() method returns an object (Series or DataFrame) conformed to the new index and columns.
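The same holds for a Series: calling reindex() on a Series returns a Series. A small sketch with made-up labels:

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])

# keep 'c' and 'a' (in that order), drop 'b', and add a new label 'z'
result = s.reindex(['c', 'a', 'z'])

print(type(result).__name__)  # Series
print(result)
```

Here 'z' has no match in the original index, so its value is NaN.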
Example 1: reindex() with columns Argument
import pandas as pd
# create a sample DataFrame
data = {'A': [1, 2, 3],
'B': [4, 5, 6]}
df = pd.DataFrame(data)
# use reindex() to change the columns
reindexed_df = df.reindex(columns=['B', 'A', 'C'])
print(reindexed_df)
Output
B A C
0 4 1 NaN
1 5 2 NaN
2 6 3 NaN
In this example, we rearranged the columns and added a new column C to the DataFrame using reindex().
Here,
- The order of existing columns A and B was changed.
- A new column C with default NaN values was added.
- The original columns of df, [A, B], were changed in reindexed_df to [B, A, C].
Example 2: Using fill_value with reindex()
import pandas as pd
# sample DataFrame
data = {'Values': [10, 20, 30]}
df = pd.DataFrame(data, index=['a', 'b', 'c'])
# reindex and fill missing values with 0
reindexed_df = df.reindex(['a', 'x', 'b', 'y', 'c'], fill_value=0)
print(reindexed_df)
Output
Values
a 10
x 0
b 20
y 0
c 30
In the example above, we filled the missing values with 0 instead of NaN using the fill_value argument.
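Note that fill_value applies only to labels introduced by reindex(). A NaN that already exists in the data is left alone, as this small sketch with a hypothetical DataFrame shows:

```python
import numpy as np
import pandas as pd

# 'b' already holds a missing value before reindexing
df = pd.DataFrame({'Values': [10, np.nan, 30]}, index=['a', 'b', 'c'])

# 'd' is a new label, so only its row receives the fill_value
filled = df.reindex(['a', 'b', 'c', 'd'], fill_value=0)
print(filled)
```

To replace pre-existing NaN values as well, a separate call such as fillna() is needed.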
Example 3: reindex() with method Argument
import pandas as pd
# sample DataFrame with numeric index
data = {'Value': [10, 20, 30]}
df = pd.DataFrame(data, index=[0, 2, 4])
# use reindex and forward fill values
reindexed_df = df.reindex([0, 1, 2, 3, 4], method='ffill')
print(reindexed_df)
Output
Value
0 10
1 10
2 20
3 20
4 30
Here, we passed ffill to the method argument to fill the missing values. ffill performs a forward fill: each new label takes the value of the closest preceding label in the original index. For example, the new label 1 takes the value 10 from label 0.
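To make the direction of filling concrete, the sketch below puts ffill next to bfill on the same data. The extra label 5 and the comparison table are additions for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'Value': [10, 20, 30]}, index=[0, 2, 4])
new_index = [0, 1, 2, 3, 4, 5]

# reindex twice and place the results side by side
comparison = pd.DataFrame({
    'ffill': df.reindex(new_index, method='ffill')['Value'],
    'bfill': df.reindex(new_index, method='bfill')['Value'],
})
print(comparison)
```

With ffill, label 5 takes the value 30 from label 4. With bfill, label 5 stays NaN because no later label exists to fill from.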
Example 4: Using the method and limit Arguments
import pandas as pd
# sample DataFrame with numeric index
data = {'Value': [10, 20, 30]}
df = pd.DataFrame(data, index=[0, 5, 10])
# use reindex with backward fill up to 2 places
reindexed_df = df.reindex(range(12), method='bfill', limit=2)
print(reindexed_df)
Output
Value
0 10.0
1 NaN
2 NaN
3 20.0
4 20.0
5 20.0
6 NaN
7 NaN
8 30.0
9 30.0
10 30.0
11 NaN
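In this example, we used the method argument with bfill for backward filling and used the limit argument to restrict the filling to 2 places.
As a result, only the two labels immediately before each existing label are filled: 3 and 4 take the value at 5, and 8 and 9 take the value at 10. Labels 1, 2, 6, and 7 are more than 2 places away, so they remain NaN. Label 11 is also NaN because there is no later value to fill from.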
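As a rough check of how limit changes the result, the sketch below reindexes the same data with limit=1 and limit=2 and lists the labels that end up with a value. The helper filled_labels() is a hypothetical name introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Value': [10, 20, 30]}, index=[0, 5, 10])

def filled_labels(limit):
    """Return the labels that hold a value after a limited backward fill."""
    result = df.reindex(range(12), method='bfill', limit=limit)
    return result.index[result['Value'].notna()].tolist()

print(filled_labels(1))  # [0, 4, 5, 9, 10]
print(filled_labels(2))  # [0, 3, 4, 5, 8, 9, 10]
```

Raising the limit by one adds one more filled label before each existing label.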
Example 5: Using tolerance with Float Index
import pandas as pd
# sample DataFrame with float index
data = {'Value': [10, 20, 30]}
df = pd.DataFrame(data, index=[0.0, 1.4, 3.0])
# reindex with tolerance
reindexed_df = df.reindex([0, 1, 2, 3], method='nearest', tolerance=0.5)
print(reindexed_df)
Output
Value
0 10.0
1 20.0
2 NaN
3 30.0
Here, the tolerance argument defines the maximum allowed distance between a requested label and the existing label used to fill it.
The label nearest to 2 is 1.4, but the distance between them is 0.6 (> 0.5), so the nearest fill is not applied at index 2 and that row is NaN.
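The distance check can be reproduced by hand. The sketch below is not how pandas implements it internally. It simply computes the distance from each requested label to its nearest existing label and compares that with the tolerance, then checks the result against reindex().

```python
import numpy as np
import pandas as pd

existing = np.array([0.0, 1.4, 3.0])
target = [0, 1, 2, 3]
tolerance = 0.5

# True where the nearest existing label lies within the tolerance
within = [bool(np.abs(existing - label).min() <= tolerance) for label in target]
print(within)  # [True, True, False, True]

df = pd.DataFrame({'Value': [10, 20, 30]}, index=existing)
reindexed = df.reindex(target, method='nearest', tolerance=tolerance)

# the rows that received a value match the manual check
print(reindexed['Value'].notna().tolist())  # [True, True, False, True]
```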