The aggregate() method in Pandas is used to perform summary computations on data, often on grouped data.
Example
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6]})
# apply sum to each column
result = df.aggregate('sum')
print(result)
'''
Output
A     6
B    15
dtype: int64
'''
aggregate() Syntax
The syntax of the aggregate() method in Pandas is:
df.aggregate(func, axis=0, *args, **kwargs)
aggregate() Arguments
The aggregate() method takes the following arguments:
- func - the aggregation to apply: a function, a function name such as 'sum' or 'mean', a list of these, or a dictionary that maps column labels to them.
- axis - specifies whether to apply the aggregation operation along rows or columns.
- *args and **kwargs - additional arguments that are passed on to the aggregation functions.
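As a rough illustration of how these arguments fit together, the sketch below passes a user-defined function as func and forwards a keyword argument to it through **kwargs. The helper value_range and its scale parameter are hypothetical names chosen for this example; they are not part of Pandas.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6]})

# hypothetical helper: the range of a column multiplied by a scale factor
def value_range(column, scale=1):
    return (column.max() - column.min()) * scale

# scale=2 is not used by aggregate() itself; it is forwarded to value_range
result = df.aggregate(value_range, axis=0, scale=2)

print(result)
```

Each column here has a range of 2, so with scale=2 the result holds 4 for both A and B.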
aggregate() Return Value
The aggregate() method can return a single value, a Series, or a DataFrame, depending on the input data and the aggregation operations specified.
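The three return types can be seen side by side in a small sketch. The DataFrame below reuses the data from the opening example, and the variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6]})

# Series + one function -> a single value
single_value = df['A'].aggregate('sum')

# DataFrame + one function -> a Series with one entry per column
series_result = df.aggregate('sum')

# DataFrame + list of functions -> a DataFrame with one row per function
frame_result = df.aggregate(['sum', 'mean'])

print(single_value)
print(type(series_result).__name__)
print(type(frame_result).__name__)
```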
Example 1: Apply Single Aggregate Function
import pandas as pd
# create a DataFrame with 'Region' and 'Sales' columns
data = {
    'Region': ['East', 'West', 'East', 'North', 'West', 'East', 'North', 'West'],
    'Sales': [100, 200, 150, 120, 250, 175, 100, 300]
}
df = pd.DataFrame(data)
# calculate total sum of the Sales column
total_sales_sum = df['Sales'].aggregate('sum')
print("Total Sales Sum:", total_sales_sum)
# calculate the mean of the Sales column
average_sales = df['Sales'].aggregate('mean')
print("Average Sales:", average_sales)
# calculate the maximum value in the Sales column
max_sales = df['Sales'].aggregate('max')
print("Maximum Sales:", max_sales)
Output
Total Sales Sum: 1395
Average Sales: 174.375
Maximum Sales: 300
Here,
- df['Sales'].aggregate('sum') - calculates the total sum of the Sales column in the df DataFrame
- df['Sales'].aggregate('mean') - calculates the mean (average) of the Sales column in the df DataFrame
- df['Sales'].aggregate('max') - computes the maximum value in the Sales column.
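The three separate calls can also be combined. As a minimal sketch, passing a list of function names to aggregate() on a Series returns a new Series whose index labels are those names. The variable name sales_stats is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Sales': [100, 200, 150, 120, 250, 175, 100, 300]})

# a list of function names returns a Series indexed by those names
sales_stats = df['Sales'].aggregate(['sum', 'mean', 'max'])

print(sales_stats)

# individual results are looked up by function name
print("Average Sales:", sales_stats['mean'])
```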
Example 2: Apply Multiple Aggregate Functions in Pandas
import pandas as pd
# create a DataFrame
data = {
    'Product': ['Widget', 'Widget', 'Gadget', 'Gadget', 'Widget', 'Gadget'],
    'Sales': [240, 350, 560, 470, 680, 590]
}
df = pd.DataFrame(data)
# group by the 'Product' column and aggregate the 'Sales' column
result = df.groupby('Product')['Sales'].agg(['sum', 'mean', 'max', 'min'])
print(result)
Output
          sum        mean  max  min
Product
Gadget   1620  540.000000  590  470
Widget   1270  423.333333  680  240
In the above example, we're using agg(), an alias of the aggregate() method, to apply multiple aggregation functions (sum, mean, max, and min) to the Sales column after grouping by the Product column.
The resulting DataFrame shows the calculated values for each category.
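When the default column names are not descriptive enough, grouped aggregation also accepts keyword arguments, where each keyword becomes an output column name. The sketch below uses the same data; the column names total and spread, and the lambda that computes the spread, are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Product': ['Widget', 'Widget', 'Gadget', 'Gadget', 'Widget', 'Gadget'],
    'Sales': [240, 350, 560, 470, 680, 590]
})

# each keyword names an output column; the value is a function name or a callable
result = df.groupby('Product')['Sales'].agg(
    total='sum',
    spread=lambda sales: sales.max() - sales.min()
)

print(result)
```

Here a custom callable sits alongside a built-in function name, which is one way to add a statistic that has no string shortcut.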
Example 3: Apply Different Aggregation Functions
import pandas as pd
data = {
    'Type': ['X', 'X', 'Y', 'Y', 'X', 'Y'],
    'Quantity': [100, 150, 200, 250, 300, 350],
    'Price': [20, 30, 40, 50, 60, 70]
}
# create the DataFrame
df = pd.DataFrame(data)
# define aggregation functions for each column
agg_funcs = {
    # applying 'sum' to Quantity column
    'Quantity': 'sum',
    # applying 'mean' and 'max' to Price column
    'Price': ['mean', 'max']
}
# group by the 'Type' column and aggregate
result = df.groupby('Type').aggregate(agg_funcs)
print(result)
Output
     Quantity      Price
          sum       mean max
Type
X         550  36.666667  60
Y         800  53.333333  70
Here, we're using the aggregate() method to apply different aggregation functions to different columns after grouping by the Type column.
The resulting DataFrame shows the calculated values for each category and each specified aggregation function.
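A dictionary that maps a column to a list of functions produces two-level column labels, as in the output above. If flat column names are preferred, one option is named aggregation with (column, function) pairs. This is a sketch on the same data, and the output names total_quantity, mean_price, and max_price are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Type': ['X', 'X', 'Y', 'Y', 'X', 'Y'],
    'Quantity': [100, 150, 200, 250, 300, 350],
    'Price': [20, 30, 40, 50, 60, 70]
})

# each keyword is an output column; each tuple is (input column, function)
result = df.groupby('Type').aggregate(
    total_quantity=('Quantity', 'sum'),
    mean_price=('Price', 'mean'),
    max_price=('Price', 'max')
)

print(result)
```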
Example 4: Use of the axis Argument
import pandas as pd
data = {
    'Value1': [10, 15, 20, 25, 30, 35],
    'Value2': [5, 8, 12, 15, 18, 21]
}
df = pd.DataFrame(data)
# apply the sum function column-wise (down the rows)
column_sum = df.aggregate('sum', axis=0)
print("Column-wise sum:")
print(column_sum)
print("\n")
# apply the sum function row-wise (across the columns)
row_sum = df.aggregate('sum', axis=1)
print("Row-wise sum:")
print(row_sum)
Output
Column-wise sum:
Value1    135
Value2     79
dtype: int64


Row-wise sum:
0    15
1    23
2    32
3    40
4    48
5    56
dtype: int64
In the above example,
- column_sum computes the sum of the values within each column individually. For the column Value1, it adds up the numbers 10, 15, 20, 25, 30, and 35.
- row_sum calculates the sum of the values across each row. For the first row, it adds the values in Value1 and Value2, which are 10 and 5, respectively.
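The axis argument also accepts the string labels 'index' and 'columns' in place of 0 and 1, which some readers find easier to follow. The sketch below uses a shortened version of the data above; the variable names by_column and by_row are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Value1': [10, 15, 20],
    'Value2': [5, 8, 12]
})

# axis='index' is the same as axis=0: one result per column
by_column = df.aggregate('sum', axis='index')

# axis='columns' is the same as axis=1: one result per row
by_row = df.aggregate('sum', axis='columns')

print(by_column)
print(by_row)
```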