In this article, we will explore how to create a Pandas DataFrame from lists in Python. Pandas is a powerful library for data manipulation and analysis, and it provides a convenient way to work with tabular data through its DataFrame object.
Step 1: Importing the Pandas Library
To get started, we need to import the Pandas library. This can be done using the following code:
import pandas as pd
Step 2: Creating Lists
Next, we need to create the lists that will serve as the data for our DataFrame. Let’s say we have three lists representing the names, ages, and salaries of a group of employees:
names = ['Alice', 'Bob', 'Charlie', 'David']
ages = [25, 32, 28, 41]
salaries = [50000, 70000, 60000, 80000]
Step 3: Creating the DataFrame
Now, we can create the DataFrame using the pd.DataFrame() constructor. We pass the lists as a dictionary, where each key becomes a column name and each list supplies that column's values:
data = {'Name': names, 'Age': ages, 'Salary': salaries}
df = pd.DataFrame(data)
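The dictionary approach relies on every list having the same length. The sketch below illustrates what happens otherwise; the shortened ages list and the mismatch_detected flag are made up for this illustration and are not part of the example above.

```python
import pandas as pd

names = ['Alice', 'Bob', 'Charlie', 'David']
ages = [25, 32, 28]  # hypothetical: deliberately one value short

try:
    # Columns of unequal length cannot be aligned into rows
    pd.DataFrame({'Name': names, 'Age': ages})
    mismatch_detected = False
except ValueError as err:
    mismatch_detected = True
    print(f"Could not build the DataFrame: {err}")
```

Checking the lengths of the lists before building the dictionary is a simple way to avoid this error.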
Step 4: Displaying the DataFrame
To see the contents of the DataFrame, we can simply print it:
print(df)
Output:
      Name  Age  Salary
0    Alice   25   50000
1      Bob   32   70000
2  Charlie   28   60000
3    David   41   80000
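Printing the whole DataFrame works well for small tables. For larger ones, a quick look at the shape and the first few rows is often enough. The following is a minimal sketch that rebuilds the same employee data; the variable names n_rows, n_cols, and preview are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 32, 28, 41],
    'Salary': [50000, 70000, 60000, 80000],
})

# shape is a (rows, columns) tuple
n_rows, n_cols = df.shape
print(n_rows, n_cols)

# head(2) returns a new DataFrame holding only the first two rows
preview = df.head(2)
print(preview)
```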
Step 5: Accessing DataFrame Elements
We can access specific elements of the DataFrame using various indexing techniques. For example, to retrieve the name of the second employee (row label 1, because the default index starts at 0), we can use the following code:
name = df.loc[1, 'Name']
print(name)
Output:
Bob
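As a sketch of a few other indexing techniques, the snippet below compares label-based access with loc to position-based access with iloc, and also selects a whole column and a whole row. The employee data is rebuilt so the snippet runs on its own; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 32, 28, 41],
    'Salary': [50000, 70000, 60000, 80000],
})

# loc selects by label: row label 1, column label 'Name'
by_label = df.loc[1, 'Name']

# iloc selects by integer position: second row, first column
by_position = df.iloc[1, 0]

# Square brackets with a column name return that column as a Series
ages_column = df['Age']

# loc with only a row label returns that row as a Series
bob_row = df.loc[1]

print(by_label, by_position)
print(ages_column.tolist())
print(bob_row['Salary'])
```

With the default index, labels and positions happen to coincide. They differ as soon as a custom index is used, which is when the choice between loc and iloc starts to matter.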
Step 6: Performing Operations on DataFrame
Pandas provides a wide range of operations that can be performed on DataFrames. For instance, we can calculate the average age of the employees using the mean() method:
average_age = df['Age'].mean()
print(average_age)
Output:
31.5
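Other operations follow the same pattern. The sketch below shows a sum, a maximum, and a row filter on the same employee data; the age threshold of 30 is an arbitrary value picked for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 32, 28, 41],
    'Salary': [50000, 70000, 60000, 80000],
})

# Column-wise aggregations
total_salary = df['Salary'].sum()
oldest_age = df['Age'].max()

# Boolean filtering: keep only rows where Age is greater than 30
over_30 = df[df['Age'] > 30]

print(total_salary)
print(oldest_age)
print(over_30['Name'].tolist())
```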
Create a simple DataFrame from three lists: names, ages, and salaries.
import pandas as pd

names = ['Alice', 'Bob', 'Charlie']
ages = [25, 32, 28]
salaries = [50000, 70000, 60000]

data = {'Name': names, 'Age': ages, 'Salary': salaries}
df = pd.DataFrame(data)
print(df)
Output:
      Name  Age  Salary
0    Alice   25   50000
1      Bob   32   70000
2  Charlie   28   60000
Create a DataFrame with an additional column, Department, built from a fourth list.
import pandas as pd

names = ['Alice', 'Bob', 'Charlie']
ages = [25, 32, 28]
salaries = [50000, 70000, 60000]
departments = ['Sales', 'Marketing', 'Finance']

data = {'Name': names, 'Age': ages, 'Salary': salaries, 'Department': departments}
df = pd.DataFrame(data)
print(df)
Output:
      Name  Age  Salary Department
0    Alice   25   50000      Sales
1      Bob   32   70000  Marketing
2  Charlie   28   60000    Finance
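If the DataFrame already exists, a list can also be attached afterwards as a new column instead of rebuilding the dictionary. This is a minimal sketch of that alternative, assuming the list has one value per existing row.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 32, 28],
    'Salary': [50000, 70000, 60000],
})

departments = ['Sales', 'Marketing', 'Finance']

# Assigning a list to a new column name appends it as the last column;
# the list length must match the number of rows
df['Department'] = departments

print(df)
```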
Create a DataFrame with specified column names and index.
import pandas as pd

data = [['Alice', 25, 50000], ['Bob', 32, 70000], ['Charlie', 28, 60000]]
columns = ['Name', 'Age', 'Salary']
index = ['Employee 1', 'Employee 2', 'Employee 3']

df = pd.DataFrame(data, columns=columns, index=index)
print(df)
Output:
               Name  Age  Salary
Employee 1    Alice   25   50000
Employee 2      Bob   32   70000
Employee 3  Charlie   28   60000
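The row-oriented form above expects one inner list per row. When the data starts out as separate per-column lists, as in the earlier examples, zip() can pair them up into rows first. The following is a sketch of that conversion, not the only way to do it; the rows variable is introduced here for illustration.

```python
import pandas as pd

names = ['Alice', 'Bob', 'Charlie']
ages = [25, 32, 28]
salaries = [50000, 70000, 60000]

# zip() combines the i-th item of each list into one row tuple
rows = list(zip(names, ages, salaries))

df = pd.DataFrame(
    rows,
    columns=['Name', 'Age', 'Salary'],
    index=['Employee 1', 'Employee 2', 'Employee 3'],
)

# With a custom index, loc uses the index labels
print(df.loc['Employee 2', 'Name'])
```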
Last Updated on May 17, 2023 by admin