Pandas assign()

The assign() method in Pandas adds new columns to a DataFrame or overwrites existing ones, and returns the result as a new DataFrame.

Example

import pandas as pd

# sample DataFrame
data = {'A': [1, 2, 3],
        'B': [4, 5, 6]}

df = pd.DataFrame(data)

# create new column C
new_df = df.assign(C=[7, 8, 9])

print(new_df)

'''
Output

   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9
'''

assign() Syntax

The syntax of the assign() method in Pandas is:

df.assign(**kwargs)

assign() Argument

The assign() method takes the following argument:

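  • **kwargs: keyword arguments in which each keyword is a column name and each value is either the data for that column or a function that takes the DataFrame and returns the data.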
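
As a rough illustration of the kinds of values **kwargs accepts, the sketch below passes a scalar, a list, a Series, and a function in a single call. The column names B through E are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})

new_df = df.assign(
    B=10,                     # scalar: repeated for every row
    C=[4, 5, 6],              # list: length must match the number of rows
    D=pd.Series([7, 8, 9]),   # Series: aligned to df by index label
    E=lambda x: x['A'] * 2,   # function: called with the DataFrame
)

print(new_df)
```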

assign() Return Value

The assign() method returns a new DataFrame with the assigned columns. The original DataFrame remains unchanged.
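
One way to check this behavior is to compare the columns of both DataFrames after the call. The snippet below is a minimal sketch that uses the same sample data as the examples in this tutorial.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})

# assign() returns a new DataFrame; df itself is not modified
new_df = df.assign(B=[4, 5, 6])

print(list(df.columns))      # ['A']
print(list(new_df.columns))  # ['A', 'B']
```

To keep the new column under the original name, assign the result back to it, for example df = df.assign(B=[4, 5, 6]).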


Example 1: Basic Column Assignment

import pandas as pd

data = {'A': [1, 2, 3]}

df = pd.DataFrame(data)

# assign a new column B
new_df = df.assign(B=[4, 5, 6])

print(new_df)

Output

   A  B
0  1  4
1  2  5
2  3  6

In this example, we added column B to a copy of df and displayed the resulting DataFrame, new_df. The original df still contains only column A.
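
The same call can also replace an existing column when the keyword matches a column name that is already present. The sketch below reuses column A from the example above; the replacement values are arbitrary.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})

# reusing the existing name A replaces its values in the returned DataFrame
new_df = df.assign(A=[10, 20, 30])

print(new_df['A'].tolist())  # [10, 20, 30]
print(df['A'].tolist())      # [1, 2, 3]
```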


Example 2: Assignment Using Functions

We can assign columns based on the values in the existing DataFrame using functions.

import pandas as pd

data = {'A': [1, 2, 3]}

df = pd.DataFrame(data)

# assign a new column B based on column A
new_df = df.assign(B=lambda x: x['A'] * 2)

print(new_df)

Output

   A  B
0  1  2
1  2  4
2  3  6

In this example, we used a lambda function to fill the new column B with double the values in column A. The lambda receives the DataFrame as its argument x.


Example 3: Multiple Column Assignments

We can assign multiple columns at once using the assign() method.

import pandas as pd

data = {'A': [1, 2, 3]}
df = pd.DataFrame(data)

# assign multiple new columns
new_df = df.assign(B=[4, 5, 6], C=[7, 8, 9])

print(new_df)

Output

   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9

In this example, we added columns B and C in a single assign() call.
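
Because assign() takes keyword arguments, several columns can also be supplied from a dictionary by unpacking it with **. This is a sketch of a general Python idiom rather than a separate Pandas feature. The dictionary name new_cols and the column name 'total score' are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})

# hypothetical dictionary mapping column names to values
new_cols = {'B': [4, 5, 6], 'total score': [7, 8, 9]}

# ** unpacks the dictionary into keyword arguments
new_df = df.assign(**new_cols)

print(list(new_df.columns))  # ['A', 'B', 'total score']
```

This form is handy when a column name contains a space or is built at runtime, since such names cannot be written directly as keywords.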

Example 4: Chaining Assignments

We can chain assign() calls so that each call can use the columns created by the previous one.

import pandas as pd

data = {'A': [1, 2, 3]}

df = pd.DataFrame(data)

# chain assignments to add columns
new_df = df.assign(B=[4, 5, 6]).assign(C=lambda x: x['A'] + x['B'])

print(new_df)

Output

   A  B  C
0  1  4  5
1  2  5  7
2  3  6  9

In this example, we first assigned column B. In the next assign() call, we used the newly created B and existing A to assign column C.
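
In recent Pandas versions (0.23 and later, to the best of our knowledge), the keyword arguments of a single assign() call are evaluated in order, so a later function can refer to a column created earlier in the same call. The sketch below assumes such a version and reproduces the result of Example 4 without chaining.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})

# C is computed after B, so the lambda can already see column B
new_df = df.assign(
    B=[4, 5, 6],
    C=lambda x: x['A'] + x['B'],
)

print(new_df['C'].tolist())  # [5, 7, 9]
```

Chaining separate assign() calls remains the more explicit choice when the order of steps should be obvious to the reader.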