Filtering Data with the Pandas .query() Method

The pandas query() method filters the rows of a DataFrame using a condition written as a string expression.

Because the expression can combine column names, comparison operators, logical operators, and Python variables, query() lets us express complex filtering conditions in a single readable string.

Syntax: DataFrame.query(expr, inplace=False, **kwargs)

expr: The filter condition, written as a string expression.
inplace: If True, filter the original DataFrame in place instead of returning a new one.
kwargs: Additional keyword arguments, which are passed on to pandas.eval().

Return type: The filtered DataFrame, or None if inplace=True
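
As a quick illustration of the inplace parameter and the return value, here is a minimal sketch. The DataFrame and the names scores, high, and result are made up for this example and are not part of the customer data used later.

```python
import pandas as pd

# Hypothetical data, used only to illustrate the parameters
scores = pd.DataFrame({'player': ['a', 'b', 'c'],
                       'points': [10, 25, 40]})

# Default (inplace=False): a new DataFrame is returned, scores is unchanged
high = scores.query('points > 20')

# inplace=True: scores itself is filtered and the call returns None
result = scores.query('points > 20', inplace=True)
```

In most code the default is the safer choice, since assigning the result of an inplace=True call to a variable gives None rather than a DataFrame.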

Using the .query() Method

The .query() method filters data based on a condition. The condition is written as a string expression that refers to columns by name, much like a WHERE clause in SQL.

Let’s say we have a DataFrame containing information about customers, including their names, ages, and the products they have purchased:

	import pandas as pd

	data = {'Name': ['John', 'Emily', 'Ryan', 'Avery', 'Michael'],
	        'Age': [25, 32, 19, 43, 28],
	        'Product': ['A', 'B', 'C', 'B', 'A']}

	df = pd.DataFrame(data)

We can use the .query() method to filter the DataFrame to only include customers who are 30 years old or younger:

	df_filtered = df.query('Age <= 30')

The resulting DataFrame will only contain the rows where the age is 30 or less:

	      Name  Age Product
	0     John   25       A
	2     Ryan   19       C
	4  Michael   28       A

Note that the filtered rows keep their original index labels (0, 2 and 4).
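
Conditions can also be combined inside the expression with and / or. The sketch below rebuilds the same customer DataFrame so it runs on its own; the variable names young_a and extremes are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Emily', 'Ryan', 'Avery', 'Michael'],
                   'Age': [25, 32, 19, 43, 28],
                   'Product': ['A', 'B', 'C', 'B', 'A']})

# Customers aged 30 or younger who bought product A
young_a = df.query('Age <= 30 and Product == "A"')

# Customers younger than 20 or older than 40
extremes = df.query('Age < 20 or Age > 40')

print(young_a)
print(extremes)
```

String values inside the expression need their own quotes, so it is convenient to use one quote style for the expression and the other for the values.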

Using Variables in the .query() Method

We can also refer to Python variables inside the .query() expression by prefixing the variable name with the @ symbol. For example, let’s say we want to filter the DataFrame based on a variable that contains the maximum age:

	max_age = 30
	df_filtered = df.query('Age <= @max_age')
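
To check what the @ prefix does, the following sketch compares the variable-based query with an ordinary boolean mask, and also passes a list to the in operator. It rebuilds the customer DataFrame so it is self-contained; the names by_variable, by_mask, wanted, and selected are assumptions made for this example.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Emily', 'Ryan', 'Avery', 'Michael'],
                   'Age': [25, 32, 19, 43, 28],
                   'Product': ['A', 'B', 'C', 'B', 'A']})

max_age = 30

# @max_age is looked up in the surrounding Python scope
by_variable = df.query('Age <= @max_age')

# The same filter written as a boolean mask
by_mask = df[df['Age'] <= max_age]
print(by_variable.equals(by_mask))

# A list variable can be used with the in operator
wanted = ['A', 'C']
selected = df.query('Product in @wanted')
print(selected)
```

Keeping the threshold in a variable means the expression string does not have to be rebuilt each time the value changes.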

Last Updated on May 16, 2023 by admin
