Get unique values from a column in a Pandas DataFrame



In this article, we will explore different methods to extract unique values from a column in a Pandas DataFrame. Knowing which distinct values a column holds is useful in many data analysis tasks, for example when checking a categorical column for unexpected entries or deciding how to group the data.

Method 1: Using the unique() Method

We can use the unique() method to obtain an array of the unique values in a column, listed in order of first appearance. Let’s consider the following example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({'Column1': ['A', 'B', 'A', 'C', 'B', 'D']})

# Get unique values from 'Column1'
unique_values = df['Column1'].unique()

# Print the unique values
print(unique_values)
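
As a small follow-up sketch (the name unique_list is illustrative and not part of the original example), the array returned by unique() can be turned into a regular Python list with tolist(), which is often more convenient for further processing:

```python
import pandas as pd

df = pd.DataFrame({'Column1': ['A', 'B', 'A', 'C', 'B', 'D']})

# unique() keeps values in order of first appearance
unique_values = df['Column1'].unique()

# Convert the array to a plain Python list
unique_list = unique_values.tolist()
print(unique_list)  # ['A', 'B', 'C', 'D']
```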

Method 2: Using the drop_duplicates() Method

The drop_duplicates() method removes duplicate rows from a DataFrame. Called on a single column, it returns a Series that holds the first occurrence of each value and keeps the original index labels. Consider the following example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({'Column1': ['A', 'B', 'A', 'C', 'B', 'D']})

# Get unique values from 'Column1'
unique_values = df['Column1'].drop_duplicates()

# Print the unique values
print(unique_values)
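
Because the result is a Series rather than an array, it carries the row labels of the first occurrences. The sketch below (variable names are illustrative) shows those labels and one possible way to renumber the result, assuming a clean 0-based index is wanted:

```python
import pandas as pd

df = pd.DataFrame({'Column1': ['A', 'B', 'A', 'C', 'B', 'D']})

deduped = df['Column1'].drop_duplicates()

# The row labels of the first occurrences are kept
original_labels = deduped.index.tolist()
print(original_labels)  # [0, 1, 3, 5]

# reset_index(drop=True) renumbers the result starting from 0
renumbered = deduped.reset_index(drop=True)
print(renumbered.tolist())  # ['A', 'B', 'C', 'D']
```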

Method 3: Using the value_counts() Method

The value_counts() method returns a Series with the count of each unique value in a column, sorted by count in descending order by default. The index of that Series holds the unique values, so retrieving it gives us what we need. Let’s see an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({'Column1': ['A', 'B', 'A', 'C', 'B', 'D']})

# Get unique values from 'Column1'
unique_values = df['Column1'].value_counts().index

# Print the unique values
print(unique_values)
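
One detail worth illustrating is how value_counts() treats missing values. The sketch below uses a sample column with a None entry (an assumption for illustration, not the data from the example above) to show that missing values are left out by default and are only counted when dropna=False is passed:

```python
import pandas as pd

# Illustrative data with one missing value
df = pd.DataFrame({'Column1': ['A', 'B', 'A', None, 'B', 'D']})

# Default behaviour: missing values are excluded from the counts
counts = df['Column1'].value_counts()
unique_values = counts.index.tolist()

# dropna=False adds an entry for the missing value
counts_with_missing = df['Column1'].value_counts(dropna=False)

print(len(counts))               # 3
print(len(counts_with_missing))  # 4
```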

Method 4: Using the set() Function

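We can also convert the column values to a set, which automatically removes duplicates, and then convert it back to a list to obtain the unique values. Note that a set is unordered, so the resulting list is not guaranteed to follow the order of the values in the column. Consider the following example: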

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({'Column1': ['A', 'B', 'A', 'C', 'B', 'D']})

# Get unique values from 'Column1'
unique_values = list(set(df['Column1']))

# Print the unique values
print(unique_values)
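
When a predictable order matters, two plain-Python variants may help. This is a minimal sketch with illustrative data (chosen so that alphabetical order and first-appearance order differ); dict.fromkeys() is swapped in here as an order-preserving alternative to set():

```python
import pandas as pd

# Illustrative data, deliberately not in alphabetical order
df = pd.DataFrame({'Column1': ['B', 'A', 'B', 'D', 'C', 'A']})

# sorted() gives a deterministic, alphabetical result
sorted_unique = sorted(set(df['Column1']))
print(sorted_unique)  # ['A', 'B', 'C', 'D']

# dict.fromkeys() removes duplicates while keeping first-appearance order
ordered_unique = list(dict.fromkeys(df['Column1']))
print(ordered_unique)  # ['B', 'A', 'D', 'C']
```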

Method 5: Using the groupby() Method

The groupby() method allows us to group the DataFrame by a specific column. The groups attribute of the grouped object is a dictionary-like mapping keyed by the group labels, so its keys are the unique values of that column. Let’s see an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({'Column1': ['A', 'B', 'A', 'C', 'B', 'D']})

# Group by 'Column1' and take the group labels as the unique values
unique_values = list(df.groupby('Column1').groups.keys())

print(unique_values)
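
To make the structure of groups more concrete, here is a brief sketch (the names groups and rows_for_a are illustrative). Each key is a unique value, and each value holds the row labels belonging to that group:

```python
import pandas as pd

df = pd.DataFrame({'Column1': ['A', 'B', 'A', 'C', 'B', 'D']})

groups = df.groupby('Column1').groups

# Keys are the group labels, i.e. the unique values of the column
unique_values = list(groups.keys())

# Values are the row labels that fall into each group
rows_for_a = list(groups['A'])
print(rows_for_a)  # [0, 2]
```

This approach does more work than the others, so it is mainly of interest when the grouping is needed anyway.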

Method 6: Using the numpy.unique() Function

The numpy.unique() function returns the sorted unique values of an array-like object. We can pass the column to this function to obtain its unique values. Consider the following example:

import pandas as pd
import numpy as np

# Create a sample DataFrame
df = pd.DataFrame({'Column1': ['A', 'B', 'A', 'C', 'B', 'D']})

# Get unique values from 'Column1'
unique_values = np.unique(df['Column1'])

print(unique_values)
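
The sorting behaviour is easier to see with data that is not already in alphabetical order. The sketch below uses illustrative data and also passes return_counts=True, an optional parameter of numpy.unique() that returns how often each value occurs:

```python
import numpy as np
import pandas as pd

# Illustrative data, deliberately not in alphabetical order
df = pd.DataFrame({'Column1': ['B', 'A', 'B', 'D', 'C', 'A']})

# The unique values come back sorted; counts line up with them
values, counts = np.unique(df['Column1'].to_numpy(), return_counts=True)

print(values.tolist())  # ['A', 'B', 'C', 'D']
print(counts.tolist())  # [2, 2, 1, 1]
```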

Method 7: Using the dropna() Method

If the column contains missing values (NaN or None), unique() reports the missing value as one of its results. To exclude it, we can call dropna() first and then unique(). Let’s see an example:

import pandas as pd

# Create a sample DataFrame with missing values
df = pd.DataFrame({'Column1': ['A', 'B', 'A', None, 'B', 'D']})

# Remove missing values and get unique values from 'Column1'
unique_values = df['Column1'].dropna().unique()

print(unique_values)
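
For comparison, the following sketch (variable names are illustrative) runs unique() on the same data with and without dropna(), so the effect of the extra call is visible:

```python
import pandas as pd

df = pd.DataFrame({'Column1': ['A', 'B', 'A', None, 'B', 'D']})

# Without dropna(), the missing value is one of the results
with_missing = df['Column1'].unique()

# With dropna(), only the real values remain
without_missing = df['Column1'].dropna().unique()

print(len(with_missing))         # 4
print(without_missing.tolist())  # ['A', 'B', 'D']
```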

Last Updated on May 17, 2023 by admin
