Pandas reindex()

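The reindex() method in Pandas conforms a DataFrame or Series to a new set of row labels (index), column labels, or both. Existing labels keep their data, and labels that are not present in the original are filled with NaN by default.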

Example

import pandas as pd

# create a sample DataFrame
data = {'A': [1, 2, 3],
        'B': [4, 5, 6]}
df = pd.DataFrame(data)

# use reindex() to change the row index
reindexed_df = df.reindex([0, 1, 2, 3])

print(reindexed_df)

'''
Output

     A    B
0  1.0  4.0
1  2.0  5.0
2  3.0  6.0
3  NaN  NaN
'''

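Here, the original index of df was [0, 1, 2], and reindex() conformed it to [0, 1, 2, 3] to form reindexed_df. Label 3 did not exist in df, so its row is filled with NaN values by default. Because NaN is a floating-point value, the integer columns A and B are converted to float, which is why the output shows 1.0 instead of 1.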
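As a minimal sketch of the same idea, the snippet below shows that reindex() matches rows by label rather than renaming them, and that the dtype only changes when a missing label introduces NaN. The variable names subset and extended are illustrative and not part of the example above.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# labels are matched, not renamed: row 2 comes first and row 1 is left out
subset = df.reindex([2, 0])
print(subset)

# label 3 has no match, so NaN is introduced and the columns become float
extended = df.reindex([0, 1, 2, 3])
print(subset.dtypes)
print(extended.dtypes)
```

Since every label in subset already exists in df, no NaN is needed and the integer dtype is kept.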


reindex() Syntax

The syntax of the reindex() method in Pandas is:

obj.reindex(labels=None, index=None, columns=None, method=None, fill_value=NaN, limit=None, tolerance=None, copy=True)

reindex() Arguments

The reindex() method takes the following arguments:

  • labels (optional) - new labels for the axis being reindexed (the row index by default)
  • index (optional) - new labels for the index (row labels)
  • columns (optional) - new labels for the columns
  • method (optional) - method used to fill the newly introduced labels: None (default, no filling), 'ffill' / 'pad', 'bfill' / 'backfill', or 'nearest'; it requires a monotonically increasing or decreasing index
  • fill_value (optional) - value to use for missing data introduced by reindexing, default is NaN
  • limit (optional) - maximum number of consecutive elements to forward or backward fill
  • tolerance (optional) - maximum distance between an original label and a new label for an inexact match to be accepted
  • copy (optional) - specifies whether to always copy the data, default is True
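The labels argument is easiest to understand next to the keyword forms. The sketch below assumes the axis parameter of DataFrame.reindex(), which is not listed in the syntax above but is available in current pandas releases; by_keyword and by_axis are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# keyword form: name the axis through the argument itself
by_keyword = df.reindex(columns=['B', 'A'])

# labels form: pass the labels positionally and pick the axis separately
by_axis = df.reindex(['B', 'A'], axis='columns')

print(by_keyword.equals(by_axis))
```

Both calls should produce the same DataFrame, so the choice between them is mostly a matter of readability.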

reindex() Return Value

The reindex() method returns an object (Series or DataFrame) conformed to the new index and columns.
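A short sketch with a Series illustrates the return value: the result is a new object, and the object reindex() was called on keeps its original index. The names s and result are illustrative.

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])

# 'c' and 'a' are taken from s, 'd' is new and becomes NaN
result = s.reindex(['c', 'a', 'd'])

print(result)
print(s.index.tolist())  # the original Series is unchanged
```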


Example 1: reindex() with columns Argument

import pandas as pd

# create a sample DataFrame
data = {'A': [1, 2, 3],
        'B': [4, 5, 6]}
df = pd.DataFrame(data)

# use reindex() to change the columns
reindexed_df = df.reindex(columns=['B', 'A', 'C'])

print(reindexed_df)

Output

   B  A   C
0  4  1 NaN
1  5  2 NaN
2  6  3 NaN

In this example, we rearranged the columns and added a new column C to the DataFrame using reindex().

Here,

  • The order of existing columns A and B was changed.
  • A new column C with default NaN values was added.
  • The original columns of df, [A, B], were changed in reindexed_df to [B, A, C].
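Rows and columns can also be reindexed in a single call. The following is a minimal sketch that reuses the DataFrame from this example; the chosen labels and the name both are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# reverse the existing rows, add row 3, keep column B, and add column C
both = df.reindex(index=[2, 1, 0, 3], columns=['B', 'C'])

print(both)
```

Column A is left out because it is not in the new column list, and every cell that has no counterpart in df is NaN.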

Example 2: Using fill_value with reindex()

import pandas as pd

# sample DataFrame
data = {'Values': [10, 20, 30]}
df = pd.DataFrame(data, index=['a', 'b', 'c'])

# reindex and fill missing values with 0
reindexed_df = df.reindex(['a', 'x', 'b', 'y', 'c'], fill_value=0)

print(reindexed_df)

Output

   Values
a     10
x      0
b     20
y      0
c     30

In the example above, the labels x and y do not exist in df, so their rows are filled with 0 instead of NaN because of the fill_value argument.
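One practical effect of fill_value is on the column dtype. The sketch below compares the default NaN fill with an integer fill; default_fill and zero_fill are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'Values': [10, 20, 30]}, index=['a', 'b', 'c'])
new_labels = ['a', 'x', 'b']

# default: the new label 'x' gets NaN, so the column becomes float
default_fill = df.reindex(new_labels)

# integer fill value: the column can stay an integer column
zero_fill = df.reindex(new_labels, fill_value=0)

print(default_fill['Values'].dtype)
print(zero_fill['Values'].dtype)
```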


Example 3: reindex() with method Argument

import pandas as pd

# sample DataFrame with numeric index
data = {'Value': [10, 20, 30]}
df = pd.DataFrame(data, index=[0, 2, 4])

# use reindex and forward fill values
reindexed_df = df.reindex([0, 1, 2, 3, 4], method='ffill')

print(reindexed_df)

Output

   Value
0     10
1     10
2     20
3     20
4     30

Here, we passed ffill to the method argument to fill the rows for the new labels. The ffill method performs a forward fill: each new label takes the value of the closest preceding label in the original index, so index 1 is filled from index 0 and index 3 is filled from index 2.
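Filling during reindexing compares labels only, so a NaN that is already stored in the original data is not replaced. The sketch below illustrates this with a modified version of the example data (the NaN at label 2 is an assumption added for illustration) and contrasts it with calling DataFrame.ffill() after reindexing.

```python
import numpy as np
import pandas as pd

# label 2 already holds NaN in the original data
df = pd.DataFrame({'Value': [10, np.nan, 30]}, index=[0, 2, 4])

# label 1 is filled from label 0; label 3 is filled from label 2, which is NaN
filled = df.reindex([0, 1, 2, 3, 4], method='ffill')
print(filled)

# filling by value after reindexing also replaces the pre-existing NaN
patched = df.reindex([0, 1, 2, 3, 4]).ffill()
print(patched)
```

Which of the two is appropriate depends on whether the existing NaN is meaningful data or a gap to be filled.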


Example 4: Using the method and limit Arguments

import pandas as pd

# sample DataFrame with numeric index
data = {'Value': [10, 20, 30]}
df = pd.DataFrame(data, index=[0, 5, 10])

# use reindex with backward fill up to 2 places
reindexed_df = df.reindex(range(12), method='bfill', limit=2)

print(reindexed_df)

Output

    Value
0    10.0
1     NaN
2     NaN
3    20.0
4    20.0
5    20.0
6     NaN
7     NaN
8    30.0
9    30.0
10   30.0
11    NaN

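In this example, we used the method argument with bfill for backward filling and used the limit argument to restrict the filling to at most 2 consecutive rows.

Only the two labels immediately before each existing label are filled: 3 and 4 take the value at 5, and 8 and 9 take the value at 10. The remaining new labels stay NaN, including 11, which has no later value to fill from.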
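To see which side of a gap limit applies to, the sketch below lists the labels that end up with a value for both bfill and ffill on the same data. The helper name labels_with_values is hypothetical and exists only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'Value': [10, 20, 30]}, index=[0, 5, 10])

def labels_with_values(frame):
    # hypothetical helper: labels whose 'Value' is not NaN
    return frame.index[frame['Value'].notna()].tolist()

backward = df.reindex(range(12), method='bfill', limit=2)
forward = df.reindex(range(12), method='ffill', limit=2)

print(labels_with_values(backward))  # [0, 3, 4, 5, 8, 9, 10]
print(labels_with_values(forward))   # [0, 1, 2, 5, 6, 7, 10, 11]
```

Backward fill covers the two labels before each existing label, while forward fill covers the two labels after it.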


Example 5: Using tolerance with Float Index

import pandas as pd

# sample DataFrame with float index
data = {'Value': [10, 20, 30]}
df = pd.DataFrame(data, index=[0.0, 1.4, 3.0])

# reindex with tolerance
reindexed_df = df.reindex([0, 1, 2, 3], method='nearest', tolerance=0.5)

print(reindexed_df)

Output

   Value
0   10.0
1   20.0
2    NaN
3   30.0

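Here, the tolerance argument defines the maximum allowed distance between a requested label and the existing label used to fill it.

The nearest existing label to 1 is 1.4, at a distance of 0.4 (within 0.5), so index 1 takes the value 20. The nearest existing label to 2 is also 1.4, but the distance is 0.6 (greater than 0.5), so the nearest fill is not applied at index 2 and it stays NaN.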
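The distance check can be reproduced by hand. The sketch below computes, with NumPy, the distance from each requested label to its nearest existing label and compares the result with what reindex() fills; the names existing, requested, distances, and within are illustrative.

```python
import numpy as np
import pandas as pd

existing = np.array([0.0, 1.4, 3.0])
requested = np.array([0, 1, 2, 3])

# distance from each requested label to its nearest existing label
distances = np.abs(requested[:, None] - existing[None, :]).min(axis=1)
within = distances <= 0.5
print(within.tolist())  # [True, True, False, True]

df = pd.DataFrame({'Value': [10, 20, 30]}, index=existing)
result = df.reindex(requested, method='nearest', tolerance=0.5)

# labels within the tolerance are the ones that received a value
print(result['Value'].notna().tolist())
```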