Ways to filter Pandas DataFrame by column values




In this post, we will look at different ways to filter a Pandas DataFrame by column values. First, let’s create a DataFrame:

# importing pandas
import pandas as pd

# declare a dictionary: each key becomes a column, each list holds its values
record = {

 'Name' : ['Ankit', 'Swapnil', 'Aishwarya',
           'Priyanka', 'Shivangi', 'Shaurya'],

 'Age' : [22, 20, 21, 19, 18, 22],

 'Stream' : ['Math', 'Commerce', 'Science',
             'Math', 'Math', 'Science'],

 'Percentage' : [90, 90, 96, 75, 70, 80] }

# create a dataframe, fixing the column order explicitly
dataframe = pd.DataFrame(record,
                         columns = ['Name', 'Age',
                                    'Stream', 'Percentage'])
# show the Dataframe
print("Given Dataframe :\n", dataframe)

Output:

The DataFrame has six rows (index 0 to 5) and four columns: Name, Age, Stream and Percentage.

Method 1: Selecting rows of a Pandas DataFrame based on a particular column value using the ‘>’, ‘>=’, ‘==’, ‘<=’ and ‘!=’ operators.

 

 

Example 1: Selecting all the rows from the given Dataframe in which ‘Percentage’ is greater than 70 using [ ].

# selecting rows based on condition
rslt_df = dataframe[dataframe['Percentage'] > 70]

print('\nResult dataframe :\n', rslt_df)

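Output:

Five rows are returned: Ankit, Swapnil, Aishwarya, Priyanka and Shaurya. Shivangi, with a Percentage of exactly 70, is filtered out.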
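To make the mechanics of [ ] filtering visible, here is a minimal sketch that splits the step in two: first build the boolean mask, then apply it. The variable name `mask` is only illustrative, and the frame is rebuilt with just two of the columns so the snippet runs on its own.

```python
import pandas as pd

# Two columns of the sample data are enough to show the idea
dataframe = pd.DataFrame({
    'Name': ['Ankit', 'Swapnil', 'Aishwarya',
             'Priyanka', 'Shivangi', 'Shaurya'],
    'Percentage': [90, 90, 96, 75, 70, 80],
})

# The comparison alone returns a boolean Series, one value per row
mask = dataframe['Percentage'] > 70
print(mask.tolist())   # [True, True, True, True, False, True]

# Passing the mask to [ ] keeps only the rows marked True
rslt_df = dataframe[mask]
print(rslt_df['Name'].tolist())
```

Note that the filtered result keeps the original index labels (0, 1, 2, 3 and 5 here); it is not renumbered.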

Example 2: Selecting all the rows from the given Dataframe in which ‘Percentage’ is greater than 70 using loc[ ].

# selecting rows based on condition
rslt_df = dataframe.loc[dataframe['Percentage'] > 70]

print('\nResult dataframe :\n',
      rslt_df)

Output:

Five rows are returned: Ankit, Swapnil, Aishwarya, Priyanka and Shaurya.
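The other comparison operators named in Method 1 work the same way. The sketch below is an illustration rather than part of the original examples: the variable names (`at_least_70`, `age_22`, `not_22`, `names_only`) are made up for this snippet, and the last line assumes you also want to restrict the columns, which loc[ ] allows by passing a column list after the row condition.

```python
import pandas as pd

dataframe = pd.DataFrame({
    'Name': ['Ankit', 'Swapnil', 'Aishwarya',
             'Priyanka', 'Shivangi', 'Shaurya'],
    'Age': [22, 20, 21, 19, 18, 22],
    'Percentage': [90, 90, 96, 75, 70, 80],
})

# '>=' includes the boundary value, so Shivangi (exactly 70) is kept
at_least_70 = dataframe.loc[dataframe['Percentage'] >= 70]

# '==' and '!=' test for equality and inequality
age_22 = dataframe.loc[dataframe['Age'] == 22]
not_22 = dataframe.loc[dataframe['Age'] != 22]

# loc[ ] also accepts a list of columns after the row condition
names_only = dataframe.loc[dataframe['Percentage'] > 70,
                           ['Name', 'Percentage']]

print(len(at_least_70), len(age_22), len(not_22))
print(names_only)
```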

 

Method 2: Selecting rows of a Pandas DataFrame whose column value is present in a list, using the isin() method.

Example 1: Selecting all the rows from the given dataframe in which ‘Stream’ is present in the options list using [ ].

options = ['Science', 'Commerce']

# selecting rows based on condition
rslt_df = dataframe[dataframe['Stream'].isin(options)]

print('\nResult dataframe :\n',
      rslt_df)

Output:

Three rows are returned: Swapnil (Commerce), Aishwarya (Science) and Shaurya (Science).

Example 2: Selecting all the rows from the given dataframe in which ‘Stream’ is present in the options list using loc[ ].

options = ['Science', 'Commerce']

# selecting rows based on condition
rslt_df = dataframe.loc[dataframe['Stream'].isin(options)]

print('\nResult dataframe :\n',
      rslt_df)

Output:

Three rows are returned: Swapnil (Commerce), Aishwarya (Science) and Shaurya (Science).
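Like a comparison, isin() produces a boolean mask, so it can be inverted with the ‘~’ operator to select the rows whose value is not in the list. This is a small sketch of that idea; the names `in_options`, `selected` and `excluded` are illustrative and do not appear in the examples above.

```python
import pandas as pd

dataframe = pd.DataFrame({
    'Name': ['Ankit', 'Swapnil', 'Aishwarya',
             'Priyanka', 'Shivangi', 'Shaurya'],
    'Stream': ['Math', 'Commerce', 'Science',
               'Math', 'Math', 'Science'],
})

options = ['Science', 'Commerce']

# isin() returns True for rows whose 'Stream' appears in options
in_options = dataframe['Stream'].isin(options)

# rows whose Stream is in the list
selected = dataframe[in_options]

# '~' flips every True/False, giving the rows NOT in the list
excluded = dataframe[~in_options]

print(selected['Name'].tolist())
print(excluded['Name'].tolist())
```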

Method 3: Selecting rows of a Pandas DataFrame based on multiple column conditions using the ‘&’ operator.

 

Example 1: Selecting all the rows from the given Dataframe in which ‘Age’ is equal to 22 and ‘Stream’ is present in the options list using [ ].

options = ['Commerce', 'Science']

# selecting rows based on condition
rslt_df = dataframe[(dataframe['Age'] == 22) &
          dataframe['Stream'].isin(options)]

print('\nResult dataframe :\n',
      rslt_df)

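Output:

One row is returned: Shaurya (Age 22, Stream Science). Ankit is also 22 but studies Math, which is not in the options list.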

Example 2: Selecting all the rows from the given Dataframe in which ‘Age’ is equal to 22 and ‘Stream’ is present in the options list using loc[ ].

options = ['Commerce', 'Science']

# selecting rows based on condition
rslt_df = dataframe.loc[(dataframe['Age'] == 22) &
              dataframe['Stream'].isin(options)]

print('\nResult dataframe :\n',
      rslt_df)

Output:

One row is returned: Shaurya (Age 22, Stream Science).
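When both conditions are comparisons, each one needs its own parentheses, because ‘&’ binds more tightly than ‘==’ and ‘>’ in Python. The sketch below shows this, and also uses the ‘|’ operator for an "or" condition as a natural companion to ‘&’. The threshold of 85 and the names `both` and `either` are assumptions chosen only for illustration.

```python
import pandas as pd

dataframe = pd.DataFrame({
    'Name': ['Ankit', 'Swapnil', 'Aishwarya',
             'Priyanka', 'Shivangi', 'Shaurya'],
    'Age': [22, 20, 21, 19, 18, 22],
    'Percentage': [90, 90, 96, 75, 70, 80],
})

# '&' keeps rows where BOTH conditions hold;
# the parentheses around each comparison are required
both = dataframe[(dataframe['Age'] == 22) &
                 (dataframe['Percentage'] > 85)]

# '|' keeps rows where AT LEAST ONE condition holds
either = dataframe[(dataframe['Age'] == 22) |
                   (dataframe['Percentage'] > 85)]

print(both['Name'].tolist())
print(either['Name'].tolist())
```

The Python keywords `and` and `or` cannot be used here, since they expect a single True/False value rather than a whole column of them.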

 

Last Updated on October 28, 2021 by admin
