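The quantile() method in Pandas returns values at the given quantile over the requested axis.
A quantile is a cut point in sorted data: the quantile q is the value below which a fraction q of the data falls. For example, the 0.5 quantile is the median. Quantiles help describe how data is distributed within a DataFrame or Series.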
Example
import pandas as pd
# sample DataFrame
data = {'A': [1, 2, 3],
'B': [4, 5, 6]}
df = pd.DataFrame(data)
# calculate the median, which is the 50th percentile or quantile(0.5)
median = df.quantile(0.5)
print(median)
'''
Output
A 2.0
B 5.0
Name: 0.5, dtype: float64
'''
quantile() Syntax
The syntax of the quantile() method in Pandas is:
df.quantile(q=0.5, axis=0, numeric_only=False, interpolation='linear')
quantile() Arguments
The quantile() method has the following arguments.
q (optional): the quantile or list of quantiles to compute; each value must be between 0 and 1 (default 0.5)
axis (optional): the axis to compute the quantile along; 0 gives one value per column and 1 gives one value per row (default 0)
numeric_only (optional): if True, only numeric columns are included; if False, the quantile of datetime and timedelta data will be computed as well (default False since pandas 2.0, True in earlier versions)
interpolation (optional): specifies the interpolation method to use when the desired quantile lies between two data points; one of 'linear', 'lower', 'higher', 'midpoint', or 'nearest' (default 'linear')
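As a rough illustration of the axis and numeric_only arguments, the sketch below uses a made-up DataFrame with a hypothetical string column named label. Passing numeric_only=True explicitly keeps the behavior the same across pandas versions, since the default has changed over time.

```python
import pandas as pd

# hypothetical sample data with one non-numeric column
df = pd.DataFrame({'A': [1, 3, 5, 7],
                   'B': [2, 4, 6, 8],
                   'label': ['w', 'x', 'y', 'z']})

# numeric_only=True skips the string column 'label'
# axis=0 (the default) computes one quantile per column
by_column = df.quantile(q=0.5, axis=0, numeric_only=True)

# axis=1 computes one quantile per row across the numeric columns
by_row = df.quantile(q=0.5, axis=1, numeric_only=True)

print(by_column)
print(by_row)
```

Here by_column holds one median for each of A and B, while by_row holds one median for each of the four rows.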
quantile() Return Value
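The quantile() method returns a scalar or Series if q is a single quantile, and a DataFrame if q is an array of multiple quantiles. More precisely, when called on a DataFrame it returns a Series for a single q and a DataFrame for an array of q values; when called on a Series it returns a scalar for a single q and a Series for an array.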
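The return types described above can be checked with a short sketch. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 3, 5, 7],
                   'B': [2, 4, 6, 8]})

# Series with a single q: a scalar
scalar_result = df['A'].quantile(0.5)

# Series with a list of q values: a Series indexed by q
series_from_series = df['A'].quantile([0.25, 0.75])

# DataFrame with a single q: a Series indexed by column name
series_from_frame = df.quantile(0.5)

# DataFrame with a list of q values: a DataFrame indexed by q
frame_result = df.quantile([0.25, 0.75])

for result in (scalar_result, series_from_series,
               series_from_frame, frame_result):
    print(type(result).__name__)
```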
Example 1: Single Quantile
import pandas as pd
data = {'A': [1, 3, 5, 7],
'B': [2, 4, 6, 8]}
df = pd.DataFrame(data)
# calculate the 25th percentile
quantile_25 = df.quantile(0.25)
print(quantile_25)
Output
A    2.5
B    3.5
Name: 0.25, dtype: float64
Here, we calculated the 25th percentile (first quartile) for each column.
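To see where a value such as 2.5 comes from, the sketch below reproduces the default 'linear' interpolation by hand for column A. It assumes the usual position formula q * (n - 1) on the sorted values; the helper variables are for illustration only.

```python
import math
import pandas as pd

values = sorted([1, 3, 5, 7])
q = 0.25

# fractional position of the quantile in the sorted data
pos = q * (len(values) - 1)        # 0.25 * 3 = 0.75

lower = math.floor(pos)            # index of the data point just below
upper = math.ceil(pos)             # index of the data point just above
fraction = pos - lower             # how far between the two points

# linear interpolation between the two neighboring values
manual = values[lower] + (values[upper] - values[lower]) * fraction

print(manual)
print(pd.Series(values).quantile(q))
```

The position 0.75 lies between the first value (1) and the second value (3), so the result is 1 + 0.75 * (3 - 1) = 2.5.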
Example 2: Multiple Quantiles
import pandas as pd
data = {'A': [1, 3, 5, 7],
'B': [2, 4, 6, 8]}
df = pd.DataFrame(data)
# calculate the 25th and 75th percentiles
quantiles = df.quantile([0.25, 0.75])
print(quantiles)
Output
        A    B
0.25  2.5  3.5
0.75  5.5  6.5
In this example, we calculated multiple quantiles for each column, resulting in a DataFrame showing the 25th and 75th percentiles.
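One possible use of such a result is shown in the sketch below: because the returned DataFrame is indexed by the quantile values, individual rows can be selected with .loc. The interquartile range computed here is only an illustration of working with the result.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 3, 5, 7],
                   'B': [2, 4, 6, 8]})

quantiles = df.quantile([0.25, 0.75])

# select each row by its quantile label
q1 = quantiles.loc[0.25]
q3 = quantiles.loc[0.75]

# interquartile range per column: 75th percentile minus 25th percentile
iqr = q3 - q1
print(iqr)
```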
Example 3: Quantile with Interpolation
import pandas as pd
data = {'A': [1, 3, 5, 7],
'B': [2, 4, 6, 8]}
df = pd.DataFrame(data)
# calculate the median with a different interpolation method
median_higher = df.quantile(0.5, interpolation='higher')
print(median_higher)
Output
A    5
B    6
Name: 0.5, dtype: int64
In this example, we have set the interpolation parameter to 'higher'.
With four values, the median position falls between the second and third data points. By choosing 'higher', we make quantile() return the larger of those two observed values (5 for A and 6 for B) instead of an interpolated value between them.
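For comparison, the sketch below runs the interpolation options side by side on a single column. It uses q=0.25 rather than the median so that the quantile position is not exactly halfway between two data points, which makes 'nearest' unambiguous.

```python
import pandas as pd

s = pd.Series([1, 3, 5, 7])

# interpolation options accepted by quantile()
methods = ['linear', 'lower', 'higher', 'midpoint', 'nearest']

# the 0.25 quantile sits at position 0.75, between the values 1 and 3
results = {m: s.quantile(0.25, interpolation=m) for m in methods}

for name, value in results.items():
    print(name, value)
```

Here 'lower' and 'higher' return the neighboring data points 1 and 3, 'midpoint' returns their average 2, 'linear' returns 2.5, and 'nearest' returns 3 because position 0.75 is closer to the second value.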