Applying Lambda Functions to a Pandas DataFrame



If you’ve worked with pandas, you’re probably familiar with the apply method. It calls a function on each element of a Series, or on each row or column of a DataFrame. Often that function is a short, one-off transformation that does not deserve its own named definition. This is where lambda functions come in handy: a lambda is an anonymous function written as a single expression, defined right where it is used.
In this article, we’ll explore how to apply lambda functions to a pandas DataFrame using the apply, applymap, and map methods.

Consider the following Pandas DataFrame:

import pandas as pd

df = pd.DataFrame({
'A': [1, 2, 3, 4],
'B': [5, 6, 7, 8],
'C': [9, 10, 11, 12]
})

If we want to apply a lambda function to the values in column ‘A’, we can use the ‘apply’ method as follows:

Using Lambda Functions with apply()

Using apply method with lambda function:

df['A'] = df['A'].apply(lambda x: x ** 2)
print(df)

Output:

    A  B   C
0   1  5   9
1   4  6  10
2   9  7  11
3  16  8  12
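Since a lambda is only an anonymous function, the same result can be obtained with a named function or, for simple arithmetic like this, with a vectorized expression. The following is a minimal sketch of that equivalence; the helper name `square` is illustrative and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4]})

# Named function equivalent to `lambda x: x ** 2` (the name is illustrative)
def square(x):
    return x ** 2

with_lambda = df['A'].apply(lambda x: x ** 2)
with_named = df['A'].apply(square)

# Vectorized arithmetic avoids calling a Python function once per element
vectorized = df['A'] ** 2

print(with_lambda.tolist())
print(with_named.tolist())
print(vectorized.tolist())
```

For element-wise arithmetic the vectorized form is usually the faster choice; apply with a lambda is most useful when the logic cannot be expressed with built-in column operations.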

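Similarly, we can use the ‘applymap’ method to apply a lambda function to every value in the DataFrame. In pandas 2.1 and later, ‘applymap’ is deprecated in favor of ‘DataFrame.map’, which does the same thing. Note that column ‘A’ was already squared in the previous step, so it is squared a second time here: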

Using applymap method with lambda function:

df = df.applymap(lambda x: x ** 2)
print(df)

Output:

     A   B    C
0    1  25   81
1   16  36  100
2   81  49  121
3  256  64  144
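Because the element-wise method is named applymap in older pandas releases and DataFrame.map from pandas 2.1 onward, code that has to run on both can pick whichever is available. This is only a sketch of one way to do that, using a small stand-in DataFrame; the variable name `elementwise` is an assumption made for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

# DataFrame.map exists in pandas 2.1+; fall back to applymap on older versions
elementwise = df.map if hasattr(df, 'map') else df.applymap

squared = elementwise(lambda x: x ** 2)
print(squared)
```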

We can also use the Series ‘map’ method to apply a lambda function to a specific column. When given a function, it works element by element, just like ‘apply’ on a Series:

Using map method with lambda function:

df['B'] = df['B'].map(lambda x: x / 5)
print(df)

Output:

     A     B    C
0    1   5.0   81
1   16   7.2  100
2   81   9.8  121
3  256  12.8  144
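On a Series, map and apply give the same result for a simple lambda, but they differ in what else they accept. The sketch below, on a small stand-in Series, shows two of those differences: map can take a dict as a lookup table, while apply can forward extra arguments to the function. The names `labels`, `scaled`, and `divisor` are illustrative.

```python
import pandas as pd

s = pd.Series([5, 6, 7, 8])

# map also accepts a dict: each value is looked up, and values
# missing from the dict become NaN
labels = s.map({5: 'five', 6: 'six'})

# apply forwards extra positional arguments to the function via `args`
scaled = s.apply(lambda x, divisor: x / divisor, args=(5,))

print(labels)
print(scaled)
```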

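We can also use the ‘apply’ method to apply a lambda function to each row by passing axis=1. The lambda then receives one row at a time as a Series, so its values can be accessed by column name: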

Using apply method with axis=1:

df['sum'] = df.apply(lambda row: row['A'] + row['B'] + row['C'], axis=1)
print(df)

Output:

     A     B    C    sum
0    1   5.0   81   87.0
1   16   7.2  100  123.2
2   81   9.8  121  211.8
3  256  12.8  144  412.8
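Row-wise apply calls the lambda once per row, which can be slow on large DataFrames. For a plain sum of columns, a built-in reduction gives the same numbers. Here is a small sketch comparing the two on the values from the example; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 16, 81, 256],
    'B': [5.0, 7.2, 9.8, 12.8],
    'C': [81, 100, 121, 144]
})

# Row-wise lambda: called once for every row
row_wise = df.apply(lambda row: row['A'] + row['B'] + row['C'], axis=1)

# Built-in reduction over the same columns
vectorized = df[['A', 'B', 'C']].sum(axis=1)

# Largest difference between the two results (floating-point noise at most)
print((row_wise - vectorized).abs().max())
```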

Code Example #1

import pandas as pd

# Create a sample dataframe
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Ella'],
                   'Age': [25, 30, 35, 40, 45],
                   'Salary': [50000, 60000, 70000, 80000, 90000]})

# Apply a lambda function to create a new column
df['Salary with Bonus'] = df['Salary'].apply(lambda x: x * 1.1 if x < 70000 else x * 1.05)

# Apply a lambda function to filter rows
df_filtered = df[df.apply(lambda x: x['Age'] > 30 and x['Salary'] > 60000, axis=1)]

# Group by an age category derived from the Age column, then average the salaries
# (a function passed directly to groupby receives the index labels, not the ages)
age_category = df['Age'].apply(lambda x: 'Young' if x < 30 else 'Old')
df_grouped = df.groupby(age_category).agg({'Salary': lambda x: x.mean()})

# Print the final dataframes
print("Dataframe with new Salary with Bonus column:", df, sep="\n")
print("\nDataframe filtered by Age and Salary:", df_filtered, sep="\n")
print("\nDataframe grouped by Age and aggregated by Salary:", df_grouped, sep="\n")

Output:

Dataframe with new Salary with Bonus column:
      Name  Age  Salary  Salary with Bonus
0    Alice   25   50000            55000.0
1      Bob   30   60000            66000.0
2  Charlie   35   70000            73500.0
3    David   40   80000            84000.0
4     Ella   45   90000            94500.0

Dataframe filtered by Age and Salary:
      Name  Age  Salary  Salary with Bonus
2  Charlie   35   70000            73500.0
3    David   40   80000            84000.0
4     Ella   45   90000            94500.0

Dataframe grouped by Age and aggregated by Salary:
        Salary
Age
Old    75000.0
Young  50000.0
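The bonus column and the row filter in this example can also be written without lambdas. The sketch below is one possible vectorized version, assuming NumPy is available: np.where selects between the two bonus rates, and a boolean mask replaces the row-wise apply. It is an alternative for comparison, not the method used in the example above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Ella'],
                   'Age': [25, 30, 35, 40, 45],
                   'Salary': [50000, 60000, 70000, 80000, 90000]})

# 10% bonus below 70000, otherwise 5% (same rule as the lambda version)
df['Salary with Bonus'] = np.where(df['Salary'] < 70000,
                                   df['Salary'] * 1.1,
                                   df['Salary'] * 1.05)

# Boolean mask: & is the element-wise "and"; each comparison needs parentheses
df_filtered = df[(df['Age'] > 30) & (df['Salary'] > 60000)]

print(df_filtered)
```

The mask form tends to be both shorter and faster than apply with axis=1, since the comparisons run on whole columns at once.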

Code Example #2

import pandas as pd

# create a sample dataframe
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'David'],
    'age': [25, 30, 35, 40],
    'income': [50000, 70000, 90000, 110000],
    'gender': ['F', 'M', 'M', 'M']
})

# apply a lambda function to calculate the tax for each person
df['tax'] = df['income'].apply(lambda x: x * 0.1)

# apply a lambda function to categorize people into different age groups
df['age_group'] = df['age'].apply(lambda x: 'young' if x < 30 else 'middle-aged' if x < 40 else 'old')

# apply a lambda function to assign a binary code to each gender
df['gender_code'] = df['gender'].apply(lambda x: 1 if x == 'M' else 0)

print(df)

Output:

       name  age  income gender      tax    age_group  gender_code
0     Alice   25   50000      F   5000.0        young            0
1       Bob   30   70000      M   7000.0  middle-aged            1
2   Charlie   35   90000      M   9000.0  middle-aged            1
3     David   40  110000      M  11000.0          old            1
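Chained conditionals inside a lambda become hard to read as the number of categories grows. As a hedged alternative, the age groups can be produced with pd.cut and the gender codes with a dict passed to map. The bin edges below assume non-negative ages, and unlike the lambda (which sends every non-'M' value to 0), the dict lookup leaves unlisted values as NaN.

```python
import pandas as pd

df = pd.DataFrame({
    'age': [25, 30, 35, 40],
    'gender': ['F', 'M', 'M', 'M']
})

# right=False makes each bin include its left edge:
# [0, 30) -> young, [30, 40) -> middle-aged, [40, inf) -> old
df['age_group'] = pd.cut(df['age'],
                         bins=[0, 30, 40, float('inf')],
                         right=False,
                         labels=['young', 'middle-aged', 'old'])

# Dict lookup instead of a conditional lambda
df['gender_code'] = df['gender'].map({'M': 1, 'F': 0})

print(df)
```

Note that pd.cut returns a categorical column rather than plain strings, which may matter if the result is later compared or concatenated with ordinary text.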

Last Updated on May 16, 2023 by admin
