Pandas DataFrame reset_index() method



Pandas DataFrame reset_index() Method with Examples

The pandas reset_index() method resets the index of a DataFrame. By default it replaces the current index with sequential integers from 0 to len(df) - 1 and moves the old index into a column. In this article, we will discuss the reset_index() method in detail and provide code examples.

Syntax:
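DataFrame.reset_index(level=None, drop=False, inplace=False, col_level=0, col_fill='')

Parameters:
level: int, str, tuple or list, default None. Removes only the given levels from the index. By default all levels are removed.
drop: bool, default False. If False, the old index is inserted into the data as a column; if True, it is discarded.
inplace: bool, default False. If True, the original data frame itself is modified instead of a new one being returned.
col_level: int or str, default 0. If the columns have multiple levels, selects the level into which the index labels are inserted.
col_fill: object, default ''. If the columns have multiple levels, determines how the other levels are named.

Return type: DataFrame, or None if inplace=True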
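
To make the drop and inplace parameters and the return type concrete, here is a minimal sketch on a tiny data frame. The data and the variable names (kept, dropped, in_place, result) are illustrative assumptions, not part of the examples in this article.

```python
import pandas as pd

# Hypothetical two-row frame whose index is named 'Name'
df = pd.DataFrame({'Age': [24, 31]},
                  index=pd.Index(['John', 'Anna'], name='Name'))

# drop=False (the default): the old index comes back as a column
kept = df.reset_index()

# drop=True: the old index is discarded
dropped = df.reset_index(drop=True)

# inplace=True: the frame is modified and the call returns None
in_place = df.copy()
result = in_place.reset_index(inplace=True)

print(list(kept.columns))     # ['Name', 'Age']
print(list(dropped.columns))  # ['Age']
print(result)                 # None
```

Assigning the result of a call that uses inplace=True is a common mistake, because the variable ends up holding None rather than a DataFrame.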

Example 1: Resetting Index

In this example, we will create a DataFrame with three columns and set the Name column as the index. We will then use the reset_index() method to reset the index and generate a new index.

# importing pandas package
import pandas as pd

# create a DataFrame
data = pd.DataFrame({
    'Name': ['John', 'Anna', 'Mark', 'Sophia', 'David'],
    'Age': [24, 31, 45, 27, 33],
    'City': ['New York', 'London', 'Paris', 'Berlin', 'Sydney']
})

# set the Name column as the index
data.set_index('Name', inplace=True)

# reset index
data.reset_index(inplace=True)

# display the DataFrame
print(data)

Output:

     Name  Age      City
0    John   24  New York
1    Anna   31    London
2    Mark   45     Paris
3  Sophia   27    Berlin
4   David   33    Sydney

As shown in the output, Name has been moved back to a regular column, and a new default integer index (0 to 4) has been generated.
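
The name of the restored column comes from the name of the index. In Example 1 the index is named Name; when an index has no name, pandas falls back to the column name index. The following sketch uses hypothetical data to show both cases, with rename_axis() as one possible way to name the index first.

```python
import pandas as pd

# Hypothetical frame with an unnamed index
df = pd.DataFrame({'Age': [24, 31]}, index=['John', 'Anna'])

# Unnamed index: the restored column is called 'index'
unnamed = df.reset_index()

# Naming the index first gives the restored column that name
named = df.rename_axis('Name').reset_index()

print(list(unnamed.columns))  # ['index', 'Age']
print(list(named.columns))    # ['Name', 'Age']
```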

Example 2: Operation on Multi-level Index

In this example, we will create a DataFrame with four columns and set the Name and Gender columns as a two-level index. We will then use the reset_index() method to move only the Gender level out of the index.

import pandas as pd

# create a DataFrame
data = pd.DataFrame({
    'Name': ['John', 'Anna', 'Mark', 'Sophia', 'David'],
    'Age': [24, 31, 45, 27, 33],
    'Gender': ['Male', 'Female', 'Male', 'Female', 'Male'],
    'City': ['New York', 'London', 'Paris', 'Berlin', 'Sydney']
})

# set Name and Gender columns as the index
data.set_index(['Name', 'Gender'], inplace=True)

# reset index
data.reset_index(level='Gender', inplace=True)

# display the DataFrame
print(data)

Output:

        Gender  Age      City
Name                         
John      Male   24  New York
Anna    Female   31    London
Mark      Male   45     Paris
Sophia  Female   27    Berlin
David     Male   33    Sydney

As shown in the output, the Gender level was moved out of the index into a regular column, and the data is now indexed by Name only.
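
The level parameter also accepts the position of a level instead of its name. As a short sketch on a reduced, hypothetical version of the data above, level=1 refers to the second index level, which is Gender here:

```python
import pandas as pd

# Reduced, hypothetical version of the Example 2 data
data = pd.DataFrame({
    'Name': ['John', 'Anna'],
    'Gender': ['Male', 'Female'],
    'Age': [24, 31]
}).set_index(['Name', 'Gender'])

by_name = data.reset_index(level='Gender')
by_position = data.reset_index(level=1)  # Gender is the second level

print(by_name.equals(by_position))  # True
print(by_name.index.name)           # Name
```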

Example 3: Using the drop parameter to keep the original index column

In this example, we will use the drop parameter to keep the original index column in the data frame. We create a small sample data frame and set its Name column as the index.

import pandas as pd

# Create a sample data frame
data = {'Name': ['John', 'Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35, 40],
        'Salary': [50000, 60000, 70000, 80000]}
df = pd.DataFrame(data)

# Set the Name column as index
df.set_index('Name', inplace=True)

# Reset the index while keeping the original Name column
df.reset_index(drop=False, inplace=True)

# Display the data frame
print(df)

Output:

      Name  Age  Salary
0     John   25   50000
1    Alice   30   60000
2      Bob   35   70000
3  Charlie   40   80000

 

In the reset_index() method, we set the drop parameter to False (its default value), which means that the original index (Name in this case) is kept in the data frame as a regular column.
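
For contrast, drop=True discards the old index instead of keeping it. One common use, sketched here with hypothetical data, is renumbering rows after a filter has left gaps in the integer index:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 35, 40]})

# Filtering keeps the original row labels (here 2 and 3)
older = df[df['Age'] > 30]

# drop=True renumbers the rows without adding an 'index' column
renumbered = older.reset_index(drop=True)

print(list(older.index))         # [2, 3]
print(list(renumbered.index))    # [0, 1]
print(list(renumbered.columns))  # ['Name', 'Age']
```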

Example 4: Using the level parameter with a list of levels

In this example, we will pass a list of index level names to the level parameter to reset the index of a data frame with a multi-level index.

import pandas as pd

# Create a sample data frame with multi-level index
arrays = [['A', 'A', 'B', 'B'], [1, 2, 1, 2]]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['First', 'Second'])
df = pd.DataFrame({'Value': [10, 20, 30, 40]}, index=index)

# Display the data frame with multi-level index
print(df)

# Reset the index for First and Second columns
df.reset_index(level=['First', 'Second'], inplace=True)

# Display the data frame with reset index
print(df)

Output:

              Value
First Second       
A     1          10
      2          20
B     1          30
      2          40
  First  Second  Value
0     A       1     10
1     A       2     20
2     B       1     30
3     B       2     40

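In the reset_index() method, we set the level parameter to a list of index level names (First and Second in this case), which moves those levels into columns. Because the list covers every level of the index, the result gets a new default integer index.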
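
Listing every level is equivalent to calling reset_index() with no level argument, while listing only some levels leaves the rest in the index. A brief sketch on the same kind of data (the variable names are illustrative):

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('A', 1), ('A', 2), ('B', 1), ('B', 2)], names=['First', 'Second'])
df = pd.DataFrame({'Value': [10, 20, 30, 40]}, index=index)

# Naming all levels gives the same result as the default call
all_levels = df.reset_index(level=['First', 'Second'])
default = df.reset_index()
print(all_levels.equals(default))  # True

# Naming only one level leaves the other one in the index
partial = df.reset_index(level=['Second'])
print(partial.index.name)     # First
print(list(partial.columns))  # ['Second', 'Value']
```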

Example 5: Using the col_level parameter to specify the column level

In this example, we will pass the col_level parameter, which selects the column level that receives the index labels when the columns have multiple levels.

import pandas as pd

# Create a sample data frame with multi-level index
arrays = [['A', 'A', 'B', 'B'], [1, 2, 1, 2]]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['First', 'Second'])
df = pd.DataFrame({'Value': [10, 20, 30, 40]}, index=index)

# Display the data frame with multi-level index
print(df)

# Reset the index and set the column level to 1
df.reset_index(level=['First', 'Second'], inplace=True, col_level=1)

# Display the data frame with reset index
print(df)

Output:

              Value
First Second       
A     1          10
      2          20
B     1          30
      2          40
  First  Second  Value
0     A       1     10
1     A       2     20
2     B       1     30
3     B       2     40

Because the columns of this data frame have only a single level, col_level has no effect here, and the result is the same as in Example 4. The parameter only matters when the columns are a MultiIndex.
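
The col_level parameter only changes the result when the columns themselves are a MultiIndex. The sketch below gives the data frame two column levels and compares the default col_level=0 with col_level=1. The labels Stats and Key are hypothetical and chosen only for illustration.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('A', 1), ('A', 2), ('B', 1), ('B', 2)], names=['First', 'Second'])

# Two-level columns; 'Stats' is a hypothetical top-level label
columns = pd.MultiIndex.from_tuples([('Stats', 'Value')])
df = pd.DataFrame([[10], [20], [30], [40]], index=index, columns=columns)

# Default col_level=0: 'First' goes into the top column level
top = df.reset_index(level='First')

# col_level=1: 'First' goes into the second column level,
# and col_fill supplies the label for the other level
bottom = df.reset_index(level='First', col_level=1, col_fill='Key')

print(top.columns.tolist())     # [('First', ''), ('Stats', 'Value')]
print(bottom.columns.tolist())  # [('Key', 'First'), ('Stats', 'Value')]
```

In both calls the Second level stays in the index, because only First was passed to level.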

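Example 6: Resetting index with drop=False

In this example, we create a larger DataFrame with 15 rows, set First Name as the index, and reset the index with drop=False. With drop=False, the replaced index is added back to the data as a column. Here the result is assigned back to df instead of using inplace=True.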

import pandas as pd

# create dataframe
data = {'First Name': ['Jason', 'Gabrielle', 'Derek', 'Darlene', 'Lily', 'Joseph',
                       'Becky', 'Harry', 'Albert', 'Sarah', 'Patrick', 'Linda',
                       'James', 'Luke', 'Anna'],
        'Age': [42, 38, 40, 36, 24, 32, 29, 45, 30, 27, 33, 50, 44, 28, 31],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Female', 'Male', 'Female',
                   'Male', 'Male', 'Female', 'Male', 'Female', 'Male', 'Male', 'Female']}

df = pd.DataFrame(data)

# set first name as index column
df.set_index('First Name', inplace=True)

# reset index with drop=False
df = df.reset_index(drop=False)

print(df)

Output:

   First Name  Age  Gender
0       Jason   42    Male
1   Gabrielle   38  Female
2       Derek   40    Male
3     Darlene   36  Female
4        Lily   24  Female
5      Joseph   32    Male
6       Becky   29  Female
7       Harry   45    Male
8      Albert   30    Male
9       Sarah   27  Female
10    Patrick   33    Male
11      Linda   50  Female
12      James   44    Male
13       Luke   28    Male
14       Anna   31  Female

 

Example 7: Resetting index with multiple levels

In this example, we create a DataFrame with four ordinary columns (A to D) and a two-level row index, and then reset the index.

import pandas as pd
  
# creating a multi-level DataFrame
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
df = pd.DataFrame({'A': [1, 2, 3, 4, 5, 6, 7, 8],
                   'B': [10, 20, 30, 40, 50, 60, 70, 80],
                   'C': [100, 200, 300, 400, 500, 600, 700, 800],
                   'D': [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000]},
                  index=index)
  
# display the DataFrame
print("Original DataFrame:")
print(df)
  
# reset the index
df_reset = df.reset_index()
  
# display the modified DataFrame
print("\nDataFrame after resetting the index:")
print(df_reset)

Output:

Original DataFrame:
              A   B    C     D
first second                
bar   one     1  10  100  1000
      two     2  20  200  2000
baz   one     3  30  300  3000
      two     4  40  400  4000
foo   one     5  50  500  5000
      two     6  60  600  6000
qux   one     7  70  700  7000
      two     8  80  800  8000
  
DataFrame after resetting the index:
  first second  A   B    C     D
0   bar    one  1  10  100  1000
1   bar    two  2  20  200  2000
2   baz    one  3  30  300  3000
3   baz    two  4  40  400  4000
4   foo    one  5  50  500  5000
5   foo    two  6  60  600  6000
6   qux    one  7  70  700  7000
7   qux    two  8  80  800  8000

The reset_index() method moves both index levels (first and second) back into columns and creates a new sequential integer index.

Note that only the rows use a MultiIndex in this example. We build it with the pd.MultiIndex.from_tuples() method and pass it as the index when creating the DataFrame; the columns A to D are ordinary single-level columns.

Last Updated on May 4, 2023 by admin
