How to Calculate the Percentage of a Column in Pandas



Pandas is a popular Python library for data manipulation and analysis. A common task is calculating the percentage of a column in a Pandas DataFrame, that is, expressing each value as its share of the column total. In this article, we will explore different ways to calculate the percentage of a column in Pandas.

Method 1: Using the apply() Method

The apply() method in Pandas allows you to apply a function to each row or column of a DataFrame or, when called on a single column (a Series), to each value in that column. To calculate the percentage of a column using the apply() method, we first compute the column total and then write a function that converts a single value into its percentage of that total. We can then apply this function to each value in the column using the apply() method.

Here is the code to calculate the percentage of a column using the apply() method:

import pandas as pd

# create a sample dataframe
data = {'name': ['John', 'Emma', 'Kate', 'Josh'],
        'score': [80, 75, 90, 85]}
df = pd.DataFrame(data)

# calculate the percentage of the 'score' column
total = df['score'].sum()
df['percentage'] = df['score'].apply(lambda x: (x / total) * 100)
print(df)

Output:

   name  score  percentage
0  John     80   24.242424
1  Emma     75   22.727273
2  Kate     90   27.272727
3  Josh     85   25.757576
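As a quick sanity check, the percentages of a column should add up to 100, apart from floating-point rounding. The following is a minimal sketch that reuses the sample data above; the variable name percentage_sum is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'name': ['John', 'Emma', 'Kate', 'Josh'],
                   'score': [80, 75, 90, 85]})

total = df['score'].sum()  # 80 + 75 + 90 + 85 = 330
df['percentage'] = df['score'].apply(lambda x: (x / total) * 100)

# Shares of a whole should sum to 100 (allowing for float rounding).
percentage_sum = df['percentage'].sum()
print(round(percentage_sum, 6))
```

If the printed sum is far from 100, the column was probably divided by the wrong total.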

Method 2: Using the div() Method

The div() method in Pandas performs element-wise division. Its argument can be another column or a single number (a scalar). To calculate the percentage of a column using the div() method, we divide the column by the sum of all its values, which is a scalar, and then multiply the result by 100 with mul() to get the percentage.

Here is the code to calculate the percentage of a column using the div() method:

import pandas as pd

# create a sample dataframe
data = {'name': ['John', 'Emma', 'Kate', 'Josh'],
        'score': [80, 75, 90, 85]}
df = pd.DataFrame(data)

# calculate the percentage of the 'score' column
total = df['score'].sum()
df['percentage'] = df['score'].div(total).mul(100)
print(df)

Output:

   name  score  percentage
0  John     80   24.242424
1  Emma     75   22.727273
2  Kate     90   27.272727
3  Josh     85   25.757576
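The same idea can be extended to several numeric columns at once, because DataFrame.div() can align a Series of column totals with the column labels. This is only a sketch: the column names math and physics and their values are made up for illustration.

```python
import pandas as pd

# Hypothetical data with two numeric columns
df = pd.DataFrame({'math': [80, 75, 90, 85],
                   'physics': [60, 90, 70, 80]})

# DataFrame.sum() returns one total per column.
totals = df.sum()

# axis='columns' matches each total to the column with the same label.
percentages = df.div(totals, axis='columns').mul(100)
print(percentages.round(2))
```

Each column of the result is computed against its own total, so every column sums to 100 independently.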

Method 3: Using the sum() Method

The sum() method in Pandas allows you to calculate the sum of a column or row. To calculate the percentage of a column using the sum() method, we first calculate the sum of the column. We then divide each value in the column by the sum, multiply by 100 to get the percentage, and assign the result to a new column:

import pandas as pd

# create example dataframe
data = {'A': [1, 2, 3, 4, 5], 'B': [10, 20, 30, 40, 50]}
df = pd.DataFrame(data)

# calculate percentage and create new column
total = df['B'].sum()
df['B_Percentage'] = df['B'] / total * 100

print(df)

Output:

   A   B  B_Percentage
0  1  10      6.666667
1  2  20     13.333333
2  3  30     20.000000
3  4  40     26.666667
4  5  50     33.333333

 

In this example, we first create a DataFrame with two columns A and B. We then calculate the total of column B using the sum() method. Next, we divide each value in the B column by the total and multiply it by 100 to get the percentage. Finally, we create a new column B_Percentage and assign the calculated percentages to it.
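Percentages such as 6.666667 are often easier to read when rounded. The sketch below repeats the calculation and rounds the result with round(); the choice of two decimal places is an assumption for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [10, 20, 30, 40, 50]})

total = df['B'].sum()  # 150

# Round after the full calculation so rounding error is introduced only once.
df['B_Percentage'] = (df['B'] / total * 100).round(2)
print(df)
```

Note that rounded percentages do not always add up to exactly 100, so keep the unrounded values if they feed into later calculations.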

Method 4: Using the apply() Method with a Lambda Function

Another way to calculate the percentage of a column is to use the apply() method along with a lambda function. Here’s an example:

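import pandas as pd

# create example dataframe
data = {'A': [1, 2, 3, 4, 5], 'B': [10, 20, 30, 40, 50]}
df = pd.DataFrame(data)

# compute the sum of the B column once, so the lambda does not recompute it for every row
total = df['B'].sum()

# calculate percentage using apply() method and lambda function
df['B_Percentage'] = df['B'].apply(lambda x: (x / total) * 100)

print(df)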

Output:

   A   B  B_Percentage
0  1  10      6.666667
1  2  20     13.333333
2  3  30     20.000000
3  4  40     26.666667
4  5  50     33.333333

 

In this example, we use the apply() method to apply a lambda function to each value in the B column. The lambda function divides each value by the sum of the B column and multiplies it by 100 to get the percentage. Finally, we create a new column B_Percentage and assign the calculated percentages to it.
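The apply() approach and plain column arithmetic should give the same numbers. The sketch below compares the two on the same sample data; the names via_apply, via_vectorized and same are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [10, 20, 30, 40, 50]})

total = df['B'].sum()  # computed once, outside the lambda

# Element-by-element calculation with apply()
via_apply = df['B'].apply(lambda x: (x / total) * 100)

# Whole-column (vectorized) calculation
via_vectorized = df['B'] / total * 100

# Compare with a small tolerance rather than exact float equality.
same = bool(((via_apply - via_vectorized).abs() < 1e-9).all())
print(same)
```

On large columns the vectorized form is generally the faster of the two, since apply() calls the Python function once per value.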

Method 5: Using the mul() Method

The mul() method in Pandas allows us to multiply each element in a column by a given value. We can use this method to multiply each value in the column by 100 and then divide by the sum of the column.

import pandas as pd

# create sample dataframe
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emily'],
    'Score': [70, 80, 90, 85, 75]
})

# calculate percentage using mul() method
df['Percentage'] = df['Score'].mul(100).div(df['Score'].sum())

# print dataframe
print(df)


Output:

      Name  Score  Percentage
0    Alice     70       17.50
1      Bob     80       20.00
2  Charlie     90       22.50
3    David     85       21.25
4    Emily     75       18.75
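Multiplying by 100 before dividing, as above, and dividing before multiplying, as in Method 2, are mathematically equivalent. The sketch below checks this on the same scores, up to floating-point rounding; the names mul_first, div_first and matches are illustrative.

```python
import pandas as pd

scores = pd.Series([70, 80, 90, 85, 75], name='Score')
total = scores.sum()  # 400

mul_first = scores.mul(100).div(total)  # (score * 100) / total
div_first = scores.div(total).mul(100)  # (score / total) * 100

# Compare with a tolerance, since the two orders can differ in the last bits.
matches = bool(((mul_first - div_first).abs() < 1e-9).all())
print(mul_first.tolist())
print(matches)
```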

Last Updated on May 14, 2023 by admin
