This tutorial shows how to drop one or more columns from a Pandas DataFrame. There are several ways to do this, and the examples below cover the most common ones.
Dropping a Single Column in Pandas
To drop a single column from a Pandas DataFrame, use the drop() method. The syntax is as follows:
df.drop('column_name', axis=1, inplace=True)
The 'column_name' argument is the label of the column you want to drop. The axis=1 parameter tells drop() to look for that label among the columns rather than in the row index (the default, axis=0). The inplace=True parameter modifies df directly and makes the call return None; without it, drop() leaves df unchanged and returns a new DataFrame.
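As a minimal sketch of the two calling styles, the example below uses a small made-up DataFrame (the column names name, age, and gender are assumptions for illustration). It shows that drop() without inplace=True returns a new DataFrame, and that the columns= keyword is an equivalent alternative to axis=1.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    'name': ['John', 'Jane'],
    'age': [30, 25],
    'gender': ['M', 'F'],
})

# Without inplace=True, drop() returns a new DataFrame; df is unchanged
without_age = df.drop('age', axis=1)

# columns= is equivalent to passing the label with axis=1
also_without_age = df.drop(columns='age')

# With inplace=True, df itself is modified and the call returns None
result = df.drop('gender', axis=1, inplace=True)

print(list(without_age.columns))       # ['name', 'gender']
print(list(also_without_age.columns))  # ['name', 'gender']
print(list(df.columns))                # ['name', 'age']
print(result)                          # None
```

Assigning the returned DataFrame to a new name is often easier to follow than inplace=True, because the original data stays available.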
Dropping Multiple Columns in Pandas
To drop multiple columns from a Pandas DataFrame, pass a list of column names to the drop() method. The syntax is as follows:
df.drop(['column_name1', 'column_name2', 'column_name3'], axis=1, inplace=True)
The list ['column_name1', 'column_name2', 'column_name3'] contains the labels of the columns you want to drop. As with a single column, axis=1 tells drop() to treat the labels as column names, and inplace=True modifies df directly instead of returning a new DataFrame.
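The sketch below applies the list form to a made-up DataFrame (the column names a to d and the label 'missing' are hypothetical). It also illustrates what happens when one of the listed names does not exist: by default drop() raises a KeyError, while errors='ignore' skips the unknown label.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'c': [5, 6], 'd': [7, 8]})

# Drop two existing columns; a new DataFrame is returned
trimmed = df.drop(['b', 'c'], axis=1)
print(list(trimmed.columns))  # ['a', 'd']

# By default, a label that does not exist raises KeyError
try:
    df.drop(['b', 'missing'], axis=1)
except KeyError as exc:
    print('KeyError:', exc)

# errors='ignore' skips labels that are not present
lenient = df.drop(['b', 'missing'], axis=1, errors='ignore')
print(list(lenient.columns))  # ['a', 'c', 'd']
```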
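Dropping rows based on a specific condition
A condition on the values of a column, such as age < 30, selects rows rather than columns. The matching index labels are therefore dropped along axis=0; passing them with axis=1 raises a KeyError because they are not column names.
import pandas as pd

# create a sample dataframe
data = {'name': ['John', 'Jane', 'Peter', 'Mary'],
        'age': [30, 25, 45, 20],
        'gender': ['M', 'F', 'M', 'F']}
df = pd.DataFrame(data)

# drop rows where age is less than 30 (index labels are row labels, so axis=0)
df.drop(df[df['age'] < 30].index, axis=0, inplace=True)
print(df)
Output:
    name  age gender
0   John   30      M
2  Peter   45      M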
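A condition such as age < 30 is evaluated per row. When the condition should instead decide which columns to remove, one possible approach is to compute a value per column and pass the matching labels to drop(). The sketch below uses made-up data and an assumed threshold of 50% to drop every column in which more than half of the values are missing.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values
df = pd.DataFrame({
    'name': ['John', 'Jane', 'Peter', 'Mary'],
    'age': [30, np.nan, 45, 20],
    'score': [np.nan, np.nan, np.nan, 88.0],
})

# Fraction of missing values in each column
missing_share = df.isna().mean()

# Labels of the columns where more than half of the values are missing
# (the 0.5 threshold is an assumption chosen for this example)
to_drop = missing_share[missing_share > 0.5].index

cleaned = df.drop(columns=to_drop)
print(list(cleaned.columns))  # ['name', 'age']
```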
Dropping columns with a specific prefix
import pandas as pd

# create a sample dataframe
data = {'A_col1': [1, 2, 3, 4],
        'A_col2': [5, 6, 7, 8],
        'B_col1': [9, 10, 11, 12],
        'B_col2': [13, 14, 15, 16]}
df = pd.DataFrame(data)

# drop columns that start with 'B_'
df = df[df.columns.drop(list(df.filter(regex='^B_')))]
print(df)
Output:
   A_col1  A_col2
0       1       5
1       2       6
2       3       7
3       4       8
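The same result can be reached without a regular expression. The sketch below is an alternative, not the method used above: it builds a boolean mask with the string accessor df.columns.str.startswith() and uses it either to select the remaining columns or to pass the matching labels to drop(). The variable names has_prefix, kept, and dropped are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A_col1': [1, 2], 'A_col2': [5, 6],
    'B_col1': [9, 10], 'B_col2': [13, 14],
})

# Boolean array: True for every column whose name starts with 'B_'
has_prefix = df.columns.str.startswith('B_')

# Keep only the columns that do not have the prefix
kept = df.loc[:, ~has_prefix]
print(list(kept.columns))  # ['A_col1', 'A_col2']

# Equivalent form using drop() with the matching labels
dropped = df.drop(columns=df.columns[has_prefix])
print(list(dropped.columns))  # ['A_col1', 'A_col2']
```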
Dropping multiple columns by position
import pandas as pd

# create a sample dataframe
data = {'name': ['John', 'Jane', 'Peter', 'Mary'],
        'age': [30, 25, 45, 20],
        'gender': ['M', 'F', 'M', 'F']}
df = pd.DataFrame(data)

# drop the 1st and 3rd columns (positions 0 and 2)
df.drop(df.columns[[0, 2]], axis=1, inplace=True)
print(df)
Output:
   age
0   30
1   25
2   45
3   20
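Because df.columns supports the usual positional indexing, negative positions and slices work as well. The following sketch assumes a made-up four-column DataFrame and shows how to drop the last column and a contiguous range of columns by position.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    'name': ['John', 'Jane'],
    'age': [30, 25],
    'gender': ['M', 'F'],
    'city': ['Oslo', 'Rome'],
})

# Drop the last column using a negative position
no_last = df.drop(columns=df.columns[-1])
print(list(no_last.columns))  # ['name', 'age', 'gender']

# Drop a contiguous range: positions 1 and 2 (the slice end is exclusive)
no_middle = df.drop(columns=df.columns[1:3])
print(list(no_middle.columns))  # ['name', 'city']
```

Positions are translated to labels before drop() runs, so this pattern assumes the column names are unique.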
Drop columns based on a list of column names
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Drop columns B and C
df.drop(['B', 'C'], axis=1, inplace=True)
Drop columns based on the type of data
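import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': ['four', 'five', 'six'], 'C': [7.0, 8.0, 9.0]})

# Keep only the numeric columns; select_dtypes() returns a new DataFrame
df = df.select_dtypes(include=[np.number])
print(df)
This keeps only the numeric columns (A and C) and drops the non-numeric column B. Because select_dtypes() returns a new DataFrame rather than modifying df, the result must be assigned back.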
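select_dtypes() also accepts an exclude= argument, which is often the more direct way to say "drop columns of this type". The sketch below reuses the same three-column layout; the variable names no_floats and numeric_only are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': ['four', 'five', 'six'],
    'C': [7.0, 8.0, 9.0],
})

# exclude= removes the listed dtypes and keeps everything else
no_floats = df.select_dtypes(exclude=['float64'])
print(list(no_floats.columns))  # ['A', 'B']

# 'number' is a string alias that matches all numeric dtypes
numeric_only = df.select_dtypes(include='number')
print(list(numeric_only.columns))  # ['A', 'C']
```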
Last Updated on May 11, 2023 by admin