Creating Pandas DataFrame from Lists



In this article, we will explore how to create a Pandas DataFrame from lists in Python. Pandas is a powerful library for data manipulation and analysis, and it provides a convenient way to work with tabular data through its DataFrame object.

Step 1: Importing the Pandas Library

To get started, we need to import the Pandas library. This can be done using the following code:

import pandas as pd

Step 2: Creating Lists

Next, we need to create the lists that will serve as the data for our DataFrame. Let’s say we have three lists representing the names, ages, and salaries of a group of employees:

names = ['Alice', 'Bob', 'Charlie', 'David']
ages = [25, 32, 28, 41]
salaries = [50000, 70000, 60000, 80000]

Step 3: Creating the DataFrame

Now, we can create the DataFrame using the pd.DataFrame() constructor. We pass the lists as a dictionary: each key becomes a column name, and each list becomes the values of that column:

data = {'Name': names, 'Age': ages, 'Salary': salaries}
df = pd.DataFrame(data)
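When building a DataFrame from a dictionary of lists, pandas raises a ValueError if the lists have different lengths. The following is a minimal sketch of checking the lengths up front so the error message names the offending columns. Note that build_frame is a hypothetical helper written for this illustration, not part of pandas:

```python
import pandas as pd


def build_frame(columns):
    """Build a DataFrame from a dict of lists, checking lengths first.

    build_frame is a hypothetical helper for illustration, not a pandas API.
    """
    # Map each column name to the length of its list.
    lengths = {name: len(values) for name, values in columns.items()}
    # More than one distinct length means the columns do not line up.
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Column lengths differ: {lengths}")
    return pd.DataFrame(columns)


names = ['Alice', 'Bob', 'Charlie', 'David']
ages = [25, 32, 28, 41]
salaries = [50000, 70000, 60000, 80000]

df = build_frame({'Name': names, 'Age': ages, 'Salary': salaries})
print(df.shape)  # (4, 3): four rows, three columns
```

Whether an explicit check like this is worth having depends on where the lists come from; for hand-written lists, the built-in pandas error is usually enough.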

Step 4: Displaying the DataFrame

To see the contents of the DataFrame, we can simply print it:

print(df)


Output:

      Name  Age  Salary
0    Alice   25   50000
1      Bob   32   70000
2  Charlie   28   60000
3    David   41   80000
  

Step 5: Accessing DataFrame Elements

We can access specific elements of the DataFrame using various indexing techniques. Because the default index starts at 0, the second employee has the row label 1. To retrieve that employee's name, we can use the following code:

name = df.loc[1, 'Name']
print(name)

Output:


Bob
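Besides loc, pandas offers a few other accessors for the same kind of lookup. The sketch below rebuilds the DataFrame from Step 3 and retrieves the same value by label, by position, and with the scalar accessor at; the variable names are chosen only for this illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 32, 28, 41],
    'Salary': [50000, 70000, 60000, 80000],
})

by_label = df.loc[1, 'Name']   # row label 1, column label 'Name'
by_position = df.iloc[1, 0]    # second row, first column, by integer position
scalar = df.at[1, 'Name']      # label-based access to a single value
row = df.loc[1]                # the whole second row as a Series

print(by_label, by_position, scalar)  # Bob Bob Bob
print(row['Salary'])                  # 70000
```

With the default index, labels and positions happen to coincide; they differ as soon as the DataFrame has a custom index or has been filtered or sorted.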
  

Step 6: Performing Operations on DataFrame

Pandas provides a wide range of operations that can be performed on DataFrames. For instance, we can calculate the average age of the employees using the mean() method:

average_age = df['Age'].mean()
print(average_age)

Output:


31.5
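Other column operations follow the same pattern as mean(). As a short sketch using the same employee data, the code below computes the average salary, finds the highest age, and selects the employees older than 30 with a boolean mask:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 32, 28, 41],
    'Salary': [50000, 70000, 60000, 80000],
})

average_salary = df['Salary'].mean()  # mean of the Salary column
oldest = df['Age'].max()              # largest value in the Age column
over_30 = df[df['Age'] > 30]          # keep only rows where Age exceeds 30

print(average_salary)            # 65000.0
print(oldest)                    # 41
print(list(over_30['Name']))     # ['Bob', 'David']
```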
 

Create a simple DataFrame from three lists: names, ages, and salaries.

import pandas as pd

names = ['Alice', 'Bob', 'Charlie']
ages = [25, 32, 28]
salaries = [50000, 70000, 60000]

data = {'Name': names, 'Age': ages, 'Salary': salaries}
df = pd.DataFrame(data)

print(df)

Output:


      Name  Age  Salary
0    Alice   25   50000
1      Bob   32   70000
2  Charlie   28   60000

Create a DataFrame with an additional column, Department.

import pandas as pd

names = ['Alice', 'Bob', 'Charlie']
ages = [25, 32, 28]
salaries = [50000, 70000, 60000]
departments = ['Sales', 'Marketing', 'Finance']

data = {'Name': names, 'Age': ages, 'Salary': salaries, 'Department': departments}
df = pd.DataFrame(data)

print(df)


Output:


      Name  Age  Salary Department
0    Alice   25   50000      Sales
1      Bob   32   70000  Marketing
2  Charlie   28   60000    Finance
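A column can also be added after the DataFrame has been created, by assigning a list to a new column name. This is a minimal sketch using the same data; the list must have as many items as the DataFrame has rows:

```python
import pandas as pd

names = ['Alice', 'Bob', 'Charlie']
ages = [25, 32, 28]
salaries = [50000, 70000, 60000]

# Start with three columns.
df = pd.DataFrame({'Name': names, 'Age': ages, 'Salary': salaries})

# Assigning a list to a new key appends it as a column.
departments = ['Sales', 'Marketing', 'Finance']
df['Department'] = departments

print(list(df.columns))  # ['Name', 'Age', 'Salary', 'Department']
```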

Create a DataFrame from a list of rows, with specified column names and index labels.

import pandas as pd

data = [['Alice', 25, 50000], ['Bob', 32, 70000], ['Charlie', 28, 60000]]
columns = ['Name', 'Age', 'Salary']
index = ['Employee 1', 'Employee 2', 'Employee 3']

df = pd.DataFrame(data, columns=columns, index=index)

print(df)

Output:


               Name  Age  Salary
Employee 1    Alice   25   50000
Employee 2      Bob   32   70000
Employee 3  Charlie   28   60000
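If the data starts out as separate lists rather than rows, zip() can combine them into the row-wise form used here. The sketch below assumes the same three lists as the earlier examples and then looks up a value by its custom index label:

```python
import pandas as pd

names = ['Alice', 'Bob', 'Charlie']
ages = [25, 32, 28]
salaries = [50000, 70000, 60000]

# zip pairs up the i-th item of each list, giving one tuple per employee.
rows = list(zip(names, ages, salaries))

df = pd.DataFrame(
    rows,
    columns=['Name', 'Age', 'Salary'],
    index=['Employee 1', 'Employee 2', 'Employee 3'],
)

# With a custom index, loc uses the index labels instead of 0, 1, 2.
print(df.loc['Employee 2', 'Age'])  # 32
```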

Last Updated on May 17, 2023 by admin
