Filtering data with Pandas .query() method



The pandas query() method filters the rows of a DataFrame using a condition written as a string expression. The expression can combine several conditions and can refer to Python variables, so even fairly complex filters can be written in a single readable line.

Syntax: DataFrame.query(expr, inplace=False, **kwargs)

Parameters:
expr: The filter condition, written as a string.
inplace: If True, modify the original DataFrame instead of returning a new one. Defaults to False.
kwargs: Additional keyword arguments, which are passed on to pandas.eval().

Return type: A filtered DataFrame, or None if inplace=True.
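
To make the effect of the inplace parameter concrete, here is a minimal sketch. The scores DataFrame and its column names are hypothetical and used only for illustration:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
scores = pd.DataFrame({'Player': ['Ann', 'Ben', 'Cal'],
                       'Score': [12, 7, 15]})

# Default (inplace=False): a new DataFrame is returned
# and the original is left untouched.
high = scores.query('Score > 10')

# inplace=True: the DataFrame itself is filtered
# and the call returns None.
working = scores.copy()
result = working.query('Score > 10', inplace=True)

print(len(scores), len(high), len(working), result)
```

Leaving inplace at its default is usually the safer choice, since the original data stays available for later steps.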

Using the .query() Method

The .query() method filters rows based on a condition. The condition is written as a string expression, much like the WHERE clause of an SQL statement.

Let’s say we have a DataFrame containing information about customers, including their names, ages, and the products they have purchased:

	import pandas as pd

	data = {'Name': ['John', 'Emily', 'Ryan', 'Avery', 'Michael'],
	        'Age': [25, 32, 19, 43, 28],
	        'Product': ['A', 'B', 'C', 'B', 'A']}

	df = pd.DataFrame(data)

We can use the .query() method to filter the DataFrame to only include customers who are 30 years old or younger:

	df_filtered = df.query('Age <= 30')

The resulting DataFrame will only contain the rows where the age is 30 or less:

	Name      Age    Product
	-------------------------
	John      25     A
	Ryan      19     C
	Michael   28     A
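
Conditions can also be combined inside a single expression. The sketch below rebuilds the same customer DataFrame so that it runs on its own, joins conditions with and / or, and compares the result with the equivalent boolean indexing. The variable names young_a, teen_or_b and young_a_mask are chosen here for illustration:

```python
import pandas as pd

data = {'Name': ['John', 'Emily', 'Ryan', 'Avery', 'Michael'],
        'Age': [25, 32, 19, 43, 28],
        'Product': ['A', 'B', 'C', 'B', 'A']}
df = pd.DataFrame(data)

# "and" keeps rows that satisfy both conditions:
# customers aged 30 or younger who bought product A.
young_a = df.query('Age <= 30 and Product == "A"')

# "or" keeps rows that satisfy either condition.
teen_or_b = df.query('Age < 20 or Product == "B"')

# The first filter again, written with boolean indexing.
young_a_mask = df[(df['Age'] <= 30) & (df['Product'] == 'A')]

print(young_a)
print(teen_or_b)
print(young_a.equals(young_a_mask))
```

Note that string values inside the expression need their own quotes, so the outer and inner quote characters must differ.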

Using Variables in the .query() Method

We can also refer to Python variables inside the .query() expression by prefixing their names with the @ symbol. For example, let’s say we want to filter the DataFrame based on a variable that contains the maximum age:

	max_age = 30
	df_filtered = df.query('Age <= @max_age')
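
The @ prefix is not limited to single numbers. As a rough sketch, the example below rebuilds the customer DataFrame and uses two numeric variables plus a list; the names min_age and wanted_products are assumptions introduced for this illustration:

```python
import pandas as pd

data = {'Name': ['John', 'Emily', 'Ryan', 'Avery', 'Michael'],
        'Age': [25, 32, 19, 43, 28],
        'Product': ['A', 'B', 'C', 'B', 'A']}
df = pd.DataFrame(data)

min_age = 20                     # hypothetical lower bound
max_age = 30
wanted_products = ['A', 'C']     # hypothetical selection

# Several variables can appear in one expression.
in_range = df.query('Age >= @min_age and Age <= @max_age')

# A list variable can be used with the "in" operator.
by_product = df.query('Product in @wanted_products')

print(in_range)
print(by_product)
```

Because the values live in ordinary Python variables, the same query string can be reused with different thresholds without being rebuilt by hand.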

Last Updated on May 16, 2023 by admin
