Pandas is a popular Python library for data manipulation and analysis. A common task is calculating the percentage of a column in a Pandas dataframe, that is, expressing each value as a share of the column's total. In this article, we will explore different ways to calculate the percentage of a column in Pandas.
Method 1: Using the apply() Method
When called on a single column (a Series), the apply() method in Pandas runs a function on each value in that column. To calculate the percentage of a column using the apply() method, we first compute the column total. We then use apply() to divide each value by that total and multiply the result by 100.
Here is the code to calculate the percentage of a column using the apply() method:
import pandas as pd

# create a sample dataframe
data = {'name': ['John', 'Emma', 'Kate', 'Josh'], 'score': [80, 75, 90, 85]}
df = pd.DataFrame(data)

# calculate the percentage of the 'score' column
total = df['score'].sum()
df['percentage'] = df['score'].apply(lambda x: (x / total) * 100)

print(df)
Output:
   name  score  percentage
0  John     80   24.242424
1  Emma     75   22.727273
2  Kate     90   27.272727
3  Josh     85   25.757576
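A quick way to sanity-check this kind of calculation is to confirm that the percentages add up to 100. The following is a minimal sketch that reuses the sample data from this method; the variable name percentage_sum is introduced here only for illustration.

```python
import math

import pandas as pd

df = pd.DataFrame({'name': ['John', 'Emma', 'Kate', 'Josh'],
                   'score': [80, 75, 90, 85]})

# Same calculation as in Method 1.
total = df['score'].sum()
df['percentage'] = df['score'].apply(lambda x: (x / total) * 100)

# The shares of a column should add up to 100, allowing for
# floating-point rounding error.
percentage_sum = df['percentage'].sum()
print(math.isclose(percentage_sum, 100.0))
```

If the check prints False, the divisor is probably not the total of the same column.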
Method 2: Using the div() Method
The div() method in Pandas divides a column element-wise by a single number or by another column. To calculate the percentage of a column using the div() method, we divide the column by the sum of all its values and then multiply the result by 100 with mul(). Because these operations run on the whole column at once, no Python function has to be called for each value.
Here is the code to calculate the percentage of a column using the div() method:
import pandas as pd

# create a sample dataframe
data = {'name': ['John', 'Emma', 'Kate', 'Josh'], 'score': [80, 75, 90, 85]}
df = pd.DataFrame(data)

# calculate the percentage of the 'score' column
total = df['score'].sum()
df['percentage'] = df['score'].div(total).mul(100)

print(df)
Output:
   name  score  percentage
0  John     80   24.242424
1  Emma     75   22.727273
2  Kate     90   27.272727
3  Josh     85   25.757576
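Methods 1 and 2 perform the same arithmetic, so they should produce the same numbers. The sketch below compares them on the sample scores; the names via_apply, via_div, and same are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'score': [80, 75, 90, 85]})
total = df['score'].sum()

# Method 1: a Python lambda called once per value.
via_apply = df['score'].apply(lambda x: (x / total) * 100)

# Method 2: whole-column division and multiplication.
via_div = df['score'].div(total).mul(100)

# Compare with a floating-point tolerance rather than exact equality.
same = np.allclose(via_apply, via_div)
print(same)
```

Since the results match, the choice between the two is mostly about readability and speed; the whole-column version is usually the faster one on large dataframes.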
Method 3: Using the sum() Method
The sum() method in Pandas allows you to calculate the sum of a column or row. To calculate the percentage of a column in a Pandas dataframe using the sum() method, we can first calculate the sum of the column. We can then divide each value in the column by the sum to get the percentage and assign it to a new column:
import pandas as pd

# create example dataframe
data = {'A': [1, 2, 3, 4, 5], 'B': [10, 20, 30, 40, 50]}
df = pd.DataFrame(data)

# calculate percentage and create new column
total = df['B'].sum()
df['B_Percentage'] = df['B'] / total * 100

print(df)
Output:
   A   B  B_Percentage
0  1  10      6.666667
1  2  20     13.333333
2  3  30     20.000000
3  4  40     26.666667
4  5  50     33.333333
In this example, we first create a DataFrame with two columns, A and B. We then calculate the total of column B using the sum() method. Next, we divide each value in the B column by the total and multiply it by 100 to get the percentage. Finally, we create a new column B_Percentage and assign the calculated percentages to it.
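The same sum-and-divide idea can be applied to several numeric columns in one step. This goes slightly beyond the single-column example and is only a sketch: it assumes every column in the dataframe is numeric, and the names pct and all_sum_to_100 are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [10, 20, 30, 40, 50]})

# df.sum() returns one total per column; dividing the dataframe by it
# matches each total to its own column.
pct = df.div(df.sum()).mul(100)

# Every column of percentages should add up to 100.
all_sum_to_100 = np.allclose(pct.sum(), 100.0)
print(all_sum_to_100)
```

Here pct has the same shape and column names as df, with each value replaced by its share of its own column.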
Method 4: Using the apply() Method with an Inline sum()
The apply() method can also compute the column total inside the lambda function itself. This avoids a separate variable, but the sum is recalculated for every value in the column. Here's an example:
import pandas as pd

data = {'A': [1, 2, 3, 4, 5], 'B': [10, 20, 30, 40, 50]}
df = pd.DataFrame(data)

# calculate percentage using apply() method and lambda function
df['B_Percentage'] = df['B'].apply(lambda x: (x / df['B'].sum()) * 100)

print(df)
Output:
   A   B  B_Percentage
0  1  10      6.666667
1  2  20     13.333333
2  3  30     20.000000
3  4  40     26.666667
4  5  50     33.333333
In this example, we use the apply() method to apply a lambda function to each value in the B column. The lambda function divides each value by the sum of the B column and multiplies it by 100 to get the percentage. Finally, we create a new column B_Percentage and assign the calculated percentages to it.
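Because the lambda in this method calls df['B'].sum() once per value, it may be worth computing the total a single time on larger dataframes. The sketch below shows both variants side by side and checks that they agree; the names recomputed, hoisted, and results_match are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [10, 20, 30, 40, 50]})

# The sum is recalculated inside the lambda for every value.
recomputed = df['B'].apply(lambda x: (x / df['B'].sum()) * 100)

# The sum is calculated once and reused by the lambda.
total = df['B'].sum()
hoisted = df['B'].apply(lambda x: (x / total) * 100)

results_match = np.allclose(recomputed, hoisted)
print(results_match)
```

Both variants give the same percentages; the second simply avoids repeating the summation.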
Method 5: Using the mul() Method
The mul() method in Pandas allows us to multiply each element in a column by a given value. We can use this method to multiply each value in the column by 100 and then divide by the sum of the column.
import pandas as pd

# create sample dataframe
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emily'],
    'Score': [70, 80, 90, 85, 75]
})

# calculate percentage using mul() method
df['Percentage'] = df['Score'].mul(100).div(df['Score'].sum())

# print dataframe
print(df)
Output:
      Name  Score  Percentage
0    Alice     70       17.50
1      Bob     80       20.00
2  Charlie     90       22.50
3    David     85       21.25
4    Emily     75       18.75
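Multiplying by 100 before dividing, as in this method, and dividing before multiplying, as in Method 2, are mathematically equivalent. The sketch below checks this on the sample scores; the names mul_first, div_first, and order_equivalent are illustrative, and the comparison uses a tolerance because the two orders can differ in the last floating-point digit.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emily'],
    'Score': [70, 80, 90, 85, 75]
})
total = df['Score'].sum()

# Multiply by 100 first, then divide by the column total.
mul_first = df['Score'].mul(100).div(total)

# Divide by the column total first, then multiply by 100.
div_first = df['Score'].div(total).mul(100)

order_equivalent = np.allclose(mul_first, div_first)
print(order_equivalent)
```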
Last Updated on May 14, 2023 by admin