Pandas loc[]

The loc[] property in Pandas is used to select data from a DataFrame based on labels or conditions.

Example

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}

df = pd.DataFrame(data)

# select the row with label 1 (second row)
selected_row = df.loc[1]
print(selected_row)

'''
Output

Name            Bob
Age              30
City    Los Angeles
Name: 1, dtype: object
'''

loc[] Syntax

The syntax of the loc[] property in Pandas is:

loc[rows, columns]

loc[] Arguments

The loc[] property takes the following arguments:

  • rows - specifies the selection criteria for rows, which can be a single label, a list of labels, a label slice, or a boolean condition.
  • columns (optional) - specifies the selection criteria for columns, which can be a single label, a list of labels, a label slice, or a boolean condition.

loc[] Return Value

The loc[] property in Pandas returns a single value, a Series, or a DataFrame, depending on how we use it and what we're selecting.
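As a rough illustration of how the return type depends on the selection, the sketch below reuses the DataFrame from the first example. The variable names row, subset, and cell are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 35],
                   'City': ['New York', 'Los Angeles', 'Chicago']})

# a single row label returns a Series
row = df.loc[1]
print(type(row).__name__)      # Series

# a list of row labels returns a DataFrame
subset = df.loc[[0, 1]]
print(type(subset).__name__)   # DataFrame

# one row label plus one column label returns a single value
cell = df.loc[1, 'Name']
print(cell)                    # Bob
```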


Example 1: Select Single Row by Label

import pandas as pd

# create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 28],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}

df = pd.DataFrame(data, index=['A', 'B', 'C', 'D'])

# select a single row by label
selected_row = df.loc['A']
print(selected_row)

Output

Name       Alice
Age           25
City    New York
Name: A, dtype: object

In the above example, we have used the loc[] property to select a single row from the DataFrame.

The index parameter specifies custom row labels A, B, C, and D for the DataFrame.

The label A is then passed to loc[], which selects the row with that label.
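Because loc[] looks rows up by label, asking for a label that does not exist raises a KeyError. A minimal sketch, using a smaller DataFrame and a hypothetical missing label Z:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]},
                  index=['A', 'B'])

try:
    df.loc['Z']          # 'Z' is not a row label in df
    found = True
except KeyError:
    found = False
    print("Label 'Z' not found")

# checking the index first avoids the exception
label = 'Z'
if label in df.index:
    print(df.loc[label])
```

Checking membership with the in operator on df.index is one simple way to guard a lookup when the label comes from user input.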


Example 2: Select Multiple Rows by Labels

import pandas as pd

# create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 28],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}

df = pd.DataFrame(data, index=['A', 'B', 'C', 'D'])

# select multiple rows by labels
selected_rows = df.loc[['A', 'C']]
print(selected_rows)

Output

      Name  Age      City
A    Alice   25  New York
C  Charlie   35   Chicago

Here, first we created the df DataFrame with row labels A, B, C, and D. Then, we used loc[['A', 'C']] to select multiple rows by the labels A and C.

Hence, the selected_rows DataFrame contains the rows with labels A and C.
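The rows come back in the order the labels are listed, not in the order they appear in the DataFrame. A short sketch with the same data (the name reordered is illustrative):

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 28],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}

df = pd.DataFrame(data, index=['A', 'B', 'C', 'D'])

# request D before A; the result follows the requested order
reordered = df.loc[['D', 'A']]
print(list(reordered.index))    # ['D', 'A']
print(list(reordered['Name']))  # ['David', 'Alice']
```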


Example 3: Select Specific Rows and Columns

import pandas as pd

# create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 28],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}

df = pd.DataFrame(data, index=['A', 'B', 'C', 'D'])

# select specific rows and columns
selected_data = df.loc[['A', 'C'], ['Name', 'Age']]
print(selected_data)

Output

      Name  Age
A    Alice   25
C  Charlie   35

In the above example, we first created the df DataFrame with row labels A, B, C, and D and columns Name, Age, and City.

  • To select specific rows, we pass a list of row labels A and C as the first argument to the loc[] property.
  • To select specific columns, we pass a list of column names Name and Age as the second argument to the loc[] property.

Hence, the output shows the selected rows A and C with the columns Name and Age.


Example 4: Slice Rows and Select Specific Columns

import pandas as pd

# create a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 28],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}

df = pd.DataFrame(data, index=['A', 'B', 'C', 'D'])

# slice rows and select specific columns
selected_data = df.loc['B':'C', ['Name', 'Age']]
print(selected_data)

Output

      Name  Age
B      Bob   30
C  Charlie   35

In this example, the loc['B':'C', ['Name', 'Age']] property slices rows from B to C, inclusive, and selects specific columns Name and Age.

So, selected_data includes rows B and C and only the columns Name and Age.

Note: To learn more about how slicing works, please visit Pandas Indexing and Slicing.
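The inclusive end point is easy to overlook, since ordinary Python slices and the position-based iloc[] property exclude the end. The sketch below contrasts the two on the same data. Using iloc[] here is only for comparison and is not part of the example above.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 28],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}

df = pd.DataFrame(data, index=['A', 'B', 'C', 'D'])

# label slice: both B and C are included
by_label = df.loc['B':'C']
print(list(by_label.index))      # ['B', 'C']

# position slice: position 1 only, the end position 2 is excluded
by_position = df.iloc[1:2]
print(list(by_position.index))   # ['B']
```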


Example 5: Select all Rows for Specific Columns

import pandas as pd

# create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 28],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}

df = pd.DataFrame(data, index=['A', 'B', 'C', 'D'])

# select all rows for specific columns
selected_columns = df.loc[:, ['Name', 'Age']]
print(selected_columns)

Output

      Name  Age
A    Alice   25
B      Bob   30
C  Charlie   35
D    David   28

Here, the : in loc[:, ['Name', 'Age']] selects all rows, and the list ['Name', 'Age'] restricts the result to those two columns.
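Columns can also be selected with a label slice instead of a list. A brief sketch on the same data, assuming the columns are in the order Name, Age, City:

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 28],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}

df = pd.DataFrame(data, index=['A', 'B', 'C', 'D'])

# column slice from Name to Age, both ends included
sliced = df.loc[:, 'Name':'Age']
print(list(sliced.columns))   # ['Name', 'Age']

# a single column label (not a list) returns a Series
ages = df.loc[:, 'Age']
print(list(ages))             # [25, 30, 35, 28]
```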


Example 6: Select Specific Rows for all Columns

import pandas as pd

data = {'Student_ID': [101, 102, 103, 104, 105],
        'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
        'Score': [85, 92, 78, 88, 95]}

df = pd.DataFrame(data)

# select rows with labels 1 and 3 for all columns
selected_rows = df.loc[[1, 3], :]
print(selected_rows)

Output

   Student_ID   Name  Score
1         102    Bob     92
3         104  David     88

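Here, loc[[1, 3], :] selects the rows with labels 1 and 3 while including all columns. Since the DataFrame uses the default integer index, which starts at 0, these are the second and fourth rows.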
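With a default integer index, labels and positions look identical, but loc[] always matches labels. The sketch below sorts the same data by Score so that labels and positions no longer line up. The comparison with iloc[] is added only for illustration.

```python
import pandas as pd

data = {'Student_ID': [101, 102, 103, 104, 105],
        'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
        'Score': [85, 92, 78, 88, 95]}

df = pd.DataFrame(data)

# sorting reorders the rows but keeps their original labels
sorted_df = df.sort_values('Score')
print(list(sorted_df.index))                      # [2, 0, 3, 1, 4]

# loc[] still finds the rows labelled 1 and 3
print(list(sorted_df.loc[[1, 3], 'Name']))        # ['Bob', 'David']

# iloc[] takes the rows at positions 1 and 3 instead
print(list(sorted_df.iloc[[1, 3]]['Name']))       # ['Alice', 'Bob']
```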


Example 7: Select Rows by Boolean Condition

import pandas as pd

# create a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 28],
        'City': ['New York', 'Denver', 'Chicago', 'Houston']}

df = pd.DataFrame(data)

# select rows where Age is greater than or equal to 30
selected_rows = df.loc[df['Age'] >= 30]
print(selected_rows)

Output

      Name  Age     City
1      Bob   30   Denver
2  Charlie   35  Chicago

Here, the expression df['Age'] >= 30 produces a boolean Series with one True or False value per row. Passing it to loc[] keeps only the rows where the value is True, so selected_rows contains the rows where Age is greater than or equal to 30.
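Conditions can be combined with & (and) or | (or), and a column selection can be added as the second argument. A minimal sketch on the same data, where the second condition on City is a made-up filter for illustration:

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 28],
        'City': ['New York', 'Denver', 'Chicago', 'Houston']}

df = pd.DataFrame(data)

# each condition needs its own parentheses when combined with &
mask = (df['Age'] >= 30) & (df['City'] != 'Denver')

# keep matching rows, and only the Name and City columns
result = df.loc[mask, ['Name', 'City']]
print(result.values.tolist())   # [['Charlie', 'Chicago']]
```

The parentheses matter because & binds more tightly than comparison operators such as >= in Python.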