Pandas std()

The std() method in Pandas computes the standard deviation of the numeric values in a Series or in the columns of a DataFrame.

The standard deviation measures how widely a set of values is spread around its mean.

Example

import pandas as pd

# sample DataFrame
data = {'A': [1, 2, 3, 4],
        'B': [5, 6, 7, 8]}

df = pd.DataFrame(data)

# calculate the standard deviation
std_dev = df.std()

print(std_dev)

'''
Output

A    1.290994
B    1.290994
dtype: float64
'''

std() Syntax

The syntax of the std() method in Pandas (version 2.0 and later) is:

df.std(axis=0, skipna=True, ddof=1, numeric_only=False, **kwargs)

std() Arguments

The std() method in Pandas has the following arguments:

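  • axis (optional): the axis to operate on; 0 or 'index' (the default for a DataFrame) gives one value per column, 1 or 'columns' gives one value per row
  • skipna (optional): whether to exclude NA/null values from the calculation; they are excluded by default
  • ddof (optional): Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of elements; default is 1
  • numeric_only (optional): if True, include only float, int, and boolean data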
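The numeric_only argument is the only one not demonstrated in the examples that follow, so here is a minimal sketch of it. The DataFrame and its 'label' column are made up for illustration. Without numeric_only=True, the text column is expected to cause an error or a warning, depending on the pandas version.

```python
import pandas as pd

# hypothetical data: two numeric columns and one text column
df = pd.DataFrame({'A': [1, 2, 3, 4],
                   'B': [5, 6, 7, 8],
                   'label': ['w', 'x', 'y', 'z']})

# restrict the calculation to numeric columns; 'label' is left out
std_numeric = df.std(numeric_only=True)

print(std_numeric)
```

The result should contain entries for A and B only.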

std() Return Value

The std() method returns:

  • A scalar, if applied to a single column of data (a Series).
  • A Series indexed by column name, if applied to multiple columns (a DataFrame).
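A quick way to confirm the two return types is to inspect the results directly. This is a small sketch using sample data; the variable names are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 3, 5, 7],
                   'B': [2, 4, 6, 8]})

# std() on a single column (a Series) gives a scalar
single = df['A'].std()

# std() on the whole DataFrame gives a Series, one entry per column
multiple = df.std()

print(pd.api.types.is_scalar(single))   # True
print(type(multiple).__name__)          # Series
```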

Example 1: Standard Deviation on a Single Column

import pandas as pd

data = {'A': [1, 3, 5, 7],
        'B': [2, 4, 6, 8]}
df = pd.DataFrame(data)

# calculate the standard deviation of one column
std_dev_column_a = df['A'].std()

print(std_dev_column_a)

Output

2.581988897471611

In this example, we calculated the standard deviation of the values in column A.


Example 2: Standard Deviation with Non-default ddof

import pandas as pd

data = {'A': [1, 3, 5, 7],
        'B': [2, 4, 6, 8]}
df = pd.DataFrame(data)

# calculate the standard deviation with ddof=0
std_dev_ddof_0 = df.std(ddof=0)

print(std_dev_ddof_0)

Output

A    2.236068
B    2.236068
dtype: float64

In this example, we set ddof (Delta Degrees of Freedom) to 0, which changes the divisor used in the calculation from N - 1 to N, where N is the number of elements. The result is the population standard deviation instead of the sample standard deviation.
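To make the role of the divisor concrete, the calculation can be reproduced by hand and compared with what std() returns. This is a sketch only; manual_std is a helper written for this illustration and is not part of Pandas.

```python
import math
import pandas as pd

values = [1, 3, 5, 7]
n = len(values)
mean = sum(values) / n

# sum of squared deviations from the mean
squared_dev = sum((x - mean) ** 2 for x in values)

def manual_std(ddof):
    # illustrative helper: divide by N - ddof, then take the square root
    return math.sqrt(squared_dev / (n - ddof))

s = pd.Series(values)

# each pair of printed values should agree
print(manual_std(1), s.std())         # divisor N - 1
print(manual_std(0), s.std(ddof=0))   # divisor N
```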


Example 3: Standard Deviation on DataFrame with NA Values

import pandas as pd

data = {'A': [1, 3, 5, None],
        'B': [2, 4, None, 8]}
df = pd.DataFrame(data)

# calculate the standard deviation while skipping NA values
std_dev_skipna = df.std(skipna=True)

print(std_dev_skipna)

Output

A    2.00000
B    3.05505
dtype: float64

Here, with skipna=True, the method skips any NaN values present in the data when calculating the standard deviation. Since NA values are skipped by default, calling df.std() without the argument gives the same result.
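For contrast, here is a short sketch of the same data with skipna=False. Because each column contains a missing value, the result for both columns is NaN.

```python
import pandas as pd

data = {'A': [1, 3, 5, None],
        'B': [2, 4, None, 8]}
df = pd.DataFrame(data)

# keep NA values in the calculation; any column containing one yields NaN
std_dev_keep_na = df.std(skipna=False)

print(std_dev_keep_na)
```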


Example 4: Standard Deviation of Rows

import pandas as pd

data = {'A': [1, 3, 5, 7],
        'B': [2, 4, 6, 8]}
df = pd.DataFrame(data)

# calculate the standard deviation with axis=1
std_dev_axis1 = df.std(axis=1)

print(std_dev_axis1)

Output

0    0.707107
1    0.707107
2    0.707107
3    0.707107
dtype: float64

In this example, we calculated the standard deviation of each row using the axis=1 argument, so the result has one value per row index.
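The axis can also be given by name. As a small sketch, axis='columns' should produce the same row-wise result as axis=1.

```python
import pandas as pd

data = {'A': [1, 3, 5, 7],
        'B': [2, 4, 6, 8]}
df = pd.DataFrame(data)

# both calls compute one standard deviation per row
by_number = df.std(axis=1)
by_name = df.std(axis='columns')

print(by_number.equals(by_name))   # True
```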