Pandas corr()

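The corr() method in Pandas computes the pairwise correlation coefficients between the columns of a DataFrame, ignoring missing values.

A correlation coefficient is a statistical measure that describes how strongly two variables are related to each other. It ranges from -1 (perfect negative relationship) to 1 (perfect positive relationship), with 0 indicating no relationship of the kind being measured.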

Example

import pandas as pd

# sample DataFrame with numeric data
data = {'A': [3, 2, 1],
        'B': [4, 6, 5],
        'C': [7, 18, 91]}

df = pd.DataFrame(data)

# compute correlation matrix
correlation_matrix = df.corr()

print(correlation_matrix)

'''
Output

          A        B         C
A  1.000000 -0.50000 -0.919953
B -0.500000  1.00000  0.120470
C -0.919953  0.12047  1.000000
'''

corr() Syntax

The syntax of the corr() method in Pandas is:

df.corr(method='pearson', min_periods=1, numeric_only=False)

corr() Arguments

The corr() method takes the following arguments:

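method (optional): the correlation method to use: 'pearson' (default), 'kendall', 'spearman', or a callable that takes two 1-D arrays and returns a float
min_periods (optional): minimum number of observations required per pair of columns to have a valid result
numeric_only (optional): whether to include only numeric (float, int, or boolean) columns in the computation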
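
As a rough illustration of these arguments, the sketch below passes the defaults from the syntax line explicitly and then switches method to 'spearman', a rank-based option that the examples on this page do not otherwise cover. The variable names are only illustrative, and the sketch assumes a pandas version whose corr() accepts numeric_only.

```python
import pandas as pd

df = pd.DataFrame({'A': [3, 2, 1],
                   'B': [4, 6, 5],
                   'C': [7, 18, 91]})

# calling corr() with no arguments ...
default_result = df.corr()

# ... should match spelling out the default values from the syntax line
explicit_result = df.corr(method='pearson', min_periods=1, numeric_only=False)

# Spearman correlation works on the ranks of the values instead of the values themselves
spearman_result = df.corr(method='spearman')

print(spearman_result)
```

Because C increases exactly as A decreases, their rank-based correlation is -1 even though the Pearson value for the same pair is about -0.92.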

corr() Return Value

The corr() method returns a DataFrame containing the correlation coefficients between columns. Its rows and columns are both labelled with the column names of the original DataFrame.
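
The returned object can be handled like any other DataFrame. The following is a minimal sketch of looking up a single coefficient and checking that the matrix is symmetric; the names corr_matrix, a_b, and is_symmetric are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [3, 2, 1],
                   'B': [4, 6, 5],
                   'C': [7, 18, 91]})

corr_matrix = df.corr()

# the result is a DataFrame labelled by column name on both axes
result_type = type(corr_matrix).__name__
print(result_type)

# look up one coefficient by row label and column label
a_b = corr_matrix.loc['A', 'B']
print(a_b)

# the correlation of A with B equals the correlation of B with A
is_symmetric = np.allclose(corr_matrix.values, corr_matrix.values.T)
print(is_symmetric)
```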

Example 1: Default Pearson Correlation Coefficient

import pandas as pd

# sample DataFrame with numeric data
data = {'A': [3, 2, 1],
        'B': [4, 6, 5],
        'C': [7, 18, 91]}

df = pd.DataFrame(data)

# compute correlation matrix
correlation_matrix = df.corr()

print(correlation_matrix)

Output

          A        B         C
A  1.000000 -0.50000 -0.919953
B -0.500000  1.00000  0.120470
C -0.919953  0.12047  1.000000

In this example, we called corr() with no arguments, so it calculated the Pearson correlation coefficient for each pair of columns. The diagonal is always 1 because every column is perfectly correlated with itself.
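
To see where these numbers come from, the sketch below recomputes the Pearson coefficient by hand with NumPy and compares it with the value from corr(). The helper pearson() is a hypothetical function written for this illustration, not part of Pandas.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [3, 2, 1],
                   'B': [4, 6, 5],
                   'C': [7, 18, 91]})

def pearson(x, y):
    """Hypothetical helper: sum of products of deviations from the mean,
    divided by the square root of the product of the summed squared deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

manual_ab = pearson(df['A'], df['B'])
pandas_ab = df.corr().loc['A', 'B']

# both values should agree up to floating-point rounding
print(manual_ab, pandas_ab)
```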

Example 2: Kendall Tau Correlation Coefficient

import pandas as pd

# sample DataFrame with numeric data
data = {'A': [3, 2, 1],
        'B': [4, 6, 5],
        'C': [7, 18, 91]}

df = pd.DataFrame(data)

# compute correlation matrix
correlation_matrix = df.corr(method='kendall')

print(correlation_matrix)

Output

          A         B         C
A  1.000000 -0.333333 -1.000000
B -0.333333  1.000000  0.333333
C -1.000000  0.333333  1.000000

In this example, we calculated the Kendall Tau correlation coefficient for each pair of columns using method='kendall'.

To learn about correlation and different correlation methods in detail, please visit Pandas Correlation.
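
For intuition, Kendall Tau compares every pair of rows and counts how often two columns move in the same direction (concordant) versus opposite directions (discordant). The sketch below is a simplified version that assumes the data has no tied values; kendall_tau() is a hypothetical helper in plain Python, not the routine Pandas uses internally.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Hypothetical helper for data without ties:
    (concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

A = [3, 2, 1]
B = [4, 6, 5]
C = [7, 18, 91]

tau_ab = kendall_tau(A, B)  # 1 concordant, 2 discordant pairs -> -1/3
tau_ac = kendall_tau(A, C)  # all 3 pairs discordant -> -1.0
tau_bc = kendall_tau(B, C)  # 2 concordant, 1 discordant pair -> 1/3

print(tau_ab, tau_ac, tau_bc)
```

These values match the off-diagonal entries in the output of Example 2.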

Example 3: Specify Minimum Number of Observations

import pandas as pd

# sample DataFrame with numeric data
data = {'A': [1, 2, 3, None, None],
        'B': [4, 7, None, None, None],
        'C': [7, 9, 8, None, None]}

df = pd.DataFrame(data)

# specify minimum number of observations required to perform computation
correlation_matrix = df.corr(min_periods=3)

print(correlation_matrix)

Output

     A   B    C
A  1.0 NaN  0.5
B  NaN NaN  NaN
C  0.5 NaN  1.0

In this example, the DataFrame df contains None values representing missing data. By setting min_periods=3, we specified that at least three non-null observations are required to compute a correlation coefficient for each pair of columns.

Here, since the B column contains only two non-null values, every pair involving B (including B with itself) falls short of the required three observations, so those coefficients are reported as NaN.
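
One way to check which pairs meet the min_periods threshold is to count the rows where both columns are non-null. The following sketch does this with a matrix product of the non-null indicators; the names present and pair_counts are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, None, None],
                   'B': [4, 7, None, None, None],
                   'C': [7, 9, 8, None, None]})

# 1 where a value is present, 0 where it is missing
present = df.notna().astype(int)

# entry (X, Y) counts the rows in which both X and Y are present
pair_counts = present.T.dot(present)
print(pair_counts)

# pairs with at least 3 shared observations are the ones that get a coefficient
enough_data = pair_counts >= 3
print(enough_data)
```

Only the pairs built from A and C reach three shared observations, which is why only those cells contain numbers in the output above.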

Example 4: Calculate Correlation for Numeric Data Only

import pandas as pd

# sample DataFrame
data = {'A': [3, 2, 'A', 1],
        'B': [4, 6, 5, 7],
        'C': [7, 18.5, 91, 55]}

df = pd.DataFrame(data)

# compute correlation matrix
correlation_matrix = df.corr(numeric_only=True)

print(correlation_matrix)

Output

         B        C
B  1.00000  0.24257
C  0.24257  1.00000

In this example, we used the numeric_only=True argument to skip the columns with non-numeric data. As a result, column A is excluded from the computation.

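This argument is useful when the DataFrame contains non-numeric data. With numeric_only=False, corr() tries to convert every column to numbers and raises a ValueError if a value such as the string 'A' cannot be converted.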
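
If dropping the whole column is not desirable, one possible alternative (not part of corr() itself) is to convert the column to numbers first so that only the unparseable entries become missing. The sketch below assumes pd.to_numeric() with errors='coerce' fits the data at hand; the name cleaned is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': [3, 2, 'A', 1],
                   'B': [4, 6, 5, 7],
                   'C': [7, 18.5, 91, 55]})

# work on a copy so the original data stays unchanged
cleaned = df.copy()

# values that cannot be parsed as numbers (here the string 'A') become NaN
cleaned['A'] = pd.to_numeric(cleaned['A'], errors='coerce')

# column A now takes part in the computation; rows where A is missing
# are skipped only for the pairs that involve A
correlation_matrix = cleaned.corr()

print(correlation_matrix)
```

The coefficient between B and C is unchanged, because the missing value in A does not affect pairs that do not involve A.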
