Pandas rank()

The rank() method in Pandas computes the rank of each element in a Series or in each column of a DataFrame, for example ranking scores from lowest to highest.

Example

import pandas as pd

# sample DataFrame
data = {'Score': [78, 85, 96, 86, 90]}

df = pd.DataFrame(data)

# rank the score
print(df.rank())

'''
Output

   Score
0    1.0
1    2.0
2    5.0
3    3.0
4    4.0
'''

rank() Syntax

The syntax of the rank() method in Pandas is:

df.rank(axis=0, method='average', numeric_only=False, na_option='keep', ascending=True, pct=False)

rank() Arguments

The rank() method takes the following arguments:

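  • axis: specifies whether to rank down each column (0, the default) or across each row (1)
  • method: specifies how to rank equal values ('average', 'min', 'max', 'first', or 'dense')
  • numeric_only: if True, ranks only the numeric columns of a DataFrame
  • na_option: specifies how to rank NaN values ('keep', 'top', or 'bottom')
  • ascending: if True (the default), the smallest value gets rank 1
  • pct: if True, returns the ranks as fractions between 0 and 1 instead of absolute ranks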
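
The examples in this article all rank down a single column, so the axis argument never comes into play. As a minimal sketch, assuming a made-up DataFrame with two score columns (Math and Physics are illustrative names, not part of the examples that follow), the difference between the two axes might look like this:

```python
import pandas as pd

# hypothetical data: three students, two test scores each
df = pd.DataFrame({'Math': [78, 85, 96], 'Physics': [88, 80, 96]})

# axis=0 (default): compare the values within each column
col_ranks = df.rank(axis=0)

# axis=1: compare the values within each row
row_ranks = df.rank(axis=1)

print(col_ranks)
print()
print(row_ranks)
```

With axis=1, the last row holds two equal values (96 and 96), so both share the average rank 1.5.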

rank() Return Value

The rank() method returns a DataFrame or Series (depending on the input) with the ranks of the data.


Example 1: Basic Ranking

import pandas as pd

data = {'Score': [78, 85, 96, 85, 90]}
df = pd.DataFrame(data)

df['Rank'] = df['Score'].rank()

print(df)

Output

   Score  Rank
0     78   1.0
1     85   2.5
2     96   5.0
3     85   2.5
4     90   4.0

In this example, we ranked the scores using rank(). By default (method='average'), equal values receive the average of the ranks they would otherwise occupy: the two 85s would take ranks 2 and 3, so each gets 2.5.


Example 2: Ranking with Method

import pandas as pd

data = {'Score': [78, 85, 96, 85, 90]}
df = pd.DataFrame(data)

# rank using the 'max' method for ties
df['Rank'] = df['Score'].rank(method='max')

print(df)

Output

   Score  Rank
0     78   1.0
1     85   3.0
2     96   5.0
3     85   3.0
4     90   4.0

In this example, we used method='max' to handle equal values. The max method assigns the highest rank in the tied group to every equal value: the two 85s would take ranks 2 and 3, so each gets 3.
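
Besides 'average' and 'max', the method argument also accepts 'min', 'first', and 'dense'. To see how they differ on the same tied scores, we can place them side by side. This is only an illustrative sketch; the variable names scores, methods, and comparison are our own.

```python
import pandas as pd

scores = pd.Series([78, 85, 96, 85, 90])

# compute the ranks once per tie-handling method
methods = ['average', 'min', 'max', 'first', 'dense']
comparison = pd.DataFrame({m: scores.rank(method=m) for m in methods})

# show the original scores next to the ranks
comparison.insert(0, 'Score', scores)

print(comparison)
```

For the two 85s, 'min' gives both rank 2, 'first' gives 2 and 3 in order of appearance, and 'dense' gives both rank 2 without leaving a gap afterwards, so 90 gets rank 3 and 96 gets rank 4.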


Example 3: Ranking in Descending Order

import pandas as pd

data = {'Score': [78, 85, 96, 85, 90]}
df = pd.DataFrame(data)

# rank in descending order
df['Rank'] = df['Score'].rank(ascending=False)

print(df)

Output

   Score  Rank
0     78   5.0
1     85   3.5
2     96   1.0
3     85   3.5
4     90   2.0

Here, we ranked the scores in descending order, so the highest score (96) receives rank 1 and the lowest score (78) receives rank 5.
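
Descending ranks are often combined with a tie method to produce leaderboard-style places. The sketch below is one possible way to do that, assuming we want tied scores to share the better place and want whole numbers instead of floats. The column name Place is our own choice.

```python
import pandas as pd

df = pd.DataFrame({'Score': [78, 85, 96, 85, 90]})

# descending order; tied scores share the best (smallest) rank
df['Place'] = df['Score'].rank(ascending=False, method='min').astype(int)

print(df)
```

Here, both 85s share third place and 78 is placed fifth. Note that astype(int) only works because the data has no NaN values.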


Example 4: Ranking Numeric Data Only

The numeric_only argument is used to rank only numeric columns when applied to a DataFrame.

import pandas as pd

data = {
    'Score': [78, 85, 96, 85, 90],
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva']
}
df = pd.DataFrame(data)

# rank all columns
print('All columns:')
print(df.rank())
print()

# rank numeric columns only
print('Numeric columns:')
print(df.rank(numeric_only=True))

Output

All columns:
   Score  Name
0    1.0   1.0
1    2.5   2.0
2    5.0   3.0
3    2.5   4.0
4    4.0   5.0

Numeric columns:
   Score
0    1.0
1    2.5
2    5.0
3    2.5
4    4.0

In the first case, the Name column is ranked alphabetically. In the second case, it is left out of the result because of the numeric_only=True argument.


Example 5: Handling NaN

We can use the na_option argument to determine how NaN values in the data are handled.

import pandas as pd

data = {'Score': [78, 85, None, 85, 90]}

df = pd.DataFrame(data)

# rank with NaN placed at the bottom
df['Rank'] = df['Score'].rank(na_option='bottom')

print(df)

Output

   Score  Rank
0   78.0   1.0
1   85.0   2.5
2    NaN   5.0
3   85.0   2.5
4   90.0   4.0

In this case, the NaN value received the highest rank (5.0). With the default na_option='keep', it would have stayed NaN in the Rank column.
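
The na_option argument accepts three values: 'keep', 'top', and 'bottom'. As a quick sketch, we can compare all three on the same data. The variable names scores and na_ranks are our own.

```python
import pandas as pd

scores = pd.Series([78, 85, None, 85, 90])

# rank the same data with each NaN-handling option
na_ranks = pd.DataFrame({
    'Score': scores,
    'keep': scores.rank(na_option='keep'),
    'top': scores.rank(na_option='top'),
    'bottom': scores.rank(na_option='bottom'),
})

print(na_ranks)
```

With 'keep', the NaN stays NaN. With 'top', it gets rank 1 and every other rank shifts up by one. With 'bottom', it gets rank 5.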


Example 6: Ranking as a Percentage

The pct argument returns the rankings in percentile form, as fractions between 0 and 1.

import pandas as pd

data = {'Score': [78, 85, 96, 85, 90]}

df = pd.DataFrame(data)

# rank the scores as percentages
df['Rank_pct'] = df['Score'].rank(pct=True)

print(df)

Output

   Score  Rank_pct
0     78       0.2
1     85       0.5
2     96       1.0
3     85       0.5
4     90       0.8

Here, each rank is divided by the number of values (5), so the tied rank 2.5 becomes 0.5 and the top rank 5 becomes 1.0.
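
To check how these values come about, we can compute them by hand. This is a small sketch for data without NaN values; the name manual is our own.

```python
import pandas as pd

scores = pd.Series([78, 85, 96, 85, 90])

# ranks in percentile form
pct_rank = scores.rank(pct=True)

# the same values computed by hand: rank divided by the number of values
manual = scores.rank() / scores.count()

print(pd.DataFrame({'Score': scores, 'pct': pct_rank, 'manual': manual}))
```

Both columns should show the same numbers for this data.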