How to plot a dataframe using Pandas?
Pandas is one of the most popular Python packages used in data science. It offers powerful and flexible data structures (DataFrame and Series) to manipulate and analyze data. Visualization is one of the most effective ways to interpret that data.
Python has many popular plotting libraries that make visualization easy, among them matplotlib, seaborn, and plotly. Pandas integrates well with matplotlib: we can plot a dataframe directly with its plot() method. But first we need a dataframe to plot. We can create one by passing a dictionary to the DataFrame() constructor of the pandas library.
Let’s create a simple dataframe:
    # importing required libraries
    # In case pandas is not installed on your machine,
    # use the command 'pip install pandas'.
    import pandas as pd
    import matplotlib.pyplot as plt

    # A dictionary which represents the data
    data_dict = {
        'name': ['p1', 'p2', 'p3', 'p4', 'p5', 'p6'],
        'age': [20, 20, 21, 20, 21, 20],
        'math_marks': [100, 90, 91, 98, 92, 95],
        'physics_marks': [90, 100, 91, 92, 98, 95],
        'chem_marks': [93, 89, 99, 92, 94, 92]
    }

    # creating a dataframe object
    df = pd.DataFrame(data_dict)

    # show the dataframe
    # by default head() returns the
    # first five rows from the top
    df.head()
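Output: the first five rows of the dataframe (p1 to p5), with the columns name, age, math_marks, physics_marks, and chem_marks.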
Plots
There are a number of plots available to interpret the data, and each one serves a different purpose. Some of them are bar plots, scatter plots, and histograms.
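As a quick illustration of the histogram kind, here is a minimal sketch that assumes the same marks data as in the dataframe created earlier. The choice of bins=5, the colors, and the axis label are arbitrary values picked for illustration, not recommendations.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Same sample values as the math_marks column created earlier
df = pd.DataFrame({
    'name': ['p1', 'p2', 'p3', 'p4', 'p5', 'p6'],
    'math_marks': [100, 90, 91, 98, 92, 95],
})

# kind='hist' groups the values of one column into bins and counts them;
# bins=5 is an arbitrary choice for this small sample
ax = df['math_marks'].plot(kind='hist', bins=5,
                           color='orange', edgecolor='black')

# plot() returns a matplotlib Axes, so it can be labelled directly
ax.set_title('Histogram')
ax.set_xlabel('math_marks')
```

Call plt.show() afterwards to display the figure, as in the other examples in this article.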
Scatter Plot:
To get a scatter plot of a dataframe, we just call the plot() method with the following parameters:
kind='scatter', x='some_column', y='some_column', color='some_color'
    # scatter plot
    df.plot(kind='scatter', x='math_marks', y='physics_marks', color='red')

    # set the title
    plt.title('ScatterPlot')

    # show the plot
    plt.show()
Output: a scatter plot of physics_marks against math_marks with red markers, titled 'ScatterPlot'.
There are many ways to customize plots; this is the most basic one.
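To give an idea of what such customization can look like, the following is a minimal sketch that assumes the same two columns as above. The marker size, transparency, figure size, and label texts are illustrative values chosen here, not settings from the original example.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Same sample values as the math_marks and physics_marks columns
df = pd.DataFrame({
    'math_marks': [100, 90, 91, 98, 92, 95],
    'physics_marks': [90, 100, 91, 92, 98, 95],
})

# s sets the marker size and alpha the transparency;
# figsize and grid are general options of DataFrame.plot()
ax = df.plot(kind='scatter', x='math_marks', y='physics_marks',
             color='red', s=80, alpha=0.6, figsize=(6, 4), grid=True)

# plot() returns a matplotlib Axes, so labels can be set on it directly
ax.set_title('ScatterPlot (customized)')
ax.set_xlabel('Math marks')
ax.set_ylabel('Physics marks')
```

Working with the returned Axes object instead of the plt functions keeps every setting tied to one specific plot, which helps once a script draws more than one figure. Call plt.show() afterwards to display the result.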
Bar Plot:
Similarly, to get a bar plot we specify the following parameters for the plot() method:
kind='bar', x='some_column', y='some_column', color='some_color'
    # bar plot
    df.plot(kind='bar', x='name', y='physics_marks', color='green')

    # set the title
    plt.title('BarPlot')

    # show the plot
    plt.show()
Output: a bar chart of physics_marks for each name with green bars, titled 'BarPlot'.
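The same bar plot can also be written with the plot accessor, where df.plot.bar(...) stands in for df.plot(kind='bar', ...). The sketch below assumes the same name and physics_marks columns. The rot=0 argument and the y-axis label are additions made here for readability.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Same sample values as the name and physics_marks columns
df = pd.DataFrame({
    'name': ['p1', 'p2', 'p3', 'p4', 'p5', 'p6'],
    'physics_marks': [90, 100, 91, 92, 98, 95],
})

# df.plot.bar(...) is equivalent to df.plot(kind='bar', ...);
# rot=0 keeps the tick labels on the x-axis horizontal
ax = df.plot.bar(x='name', y='physics_marks', color='green', rot=0)

ax.set_title('BarPlot')
ax.set_ylabel('physics_marks')
```

Both spellings produce the same chart, so the choice between them is mostly a matter of taste. Call plt.show() afterwards to display the figure.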
Line Plot:
A line plot of a single column is not always useful. To get more insight, we can plot multiple columns on the same graph. One way to do so is to reuse the same axes by passing it to each plot() call through the ax parameter:
kind='line', x='some_column', y='some_column', color='some_color', ax=some_axes
    # get the current axes
    ax = plt.gca()

    # line plot for math marks
    df.plot(kind='line', x='name', y='math_marks', color='green', ax=ax)

    # line plot for physics marks
    df.plot(kind='line', x='name', y='physics_marks', color='blue', ax=ax)

    # line plot for chemistry marks
    df.plot(kind='line', x='name', y='chem_marks', color='black', ax=ax)

    # set the title
    plt.title('LinePlots')

    # show the plot
    plt.show()
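Output: a single graph titled 'LinePlots' with three lines over name: math_marks in green, physics_marks in blue, and chem_marks in black.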
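As an alternative to reusing the axes, plot() also accepts a list of column names for y and then draws all of them in one call. This is a minimal sketch assuming the same marks data. The marker='o' setting and the y-axis label are illustrative additions.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Same sample values as the dataframe used throughout the article
df = pd.DataFrame({
    'name': ['p1', 'p2', 'p3', 'p4', 'p5', 'p6'],
    'math_marks': [100, 90, 91, 98, 92, 95],
    'physics_marks': [90, 100, 91, 92, 98, 95],
    'chem_marks': [93, 89, 99, 92, 94, 92],
})

# A list for y draws one line per column on the same axes;
# the color list is matched to the columns in order
ax = df.plot(kind='line', x='name',
             y=['math_marks', 'physics_marks', 'chem_marks'],
             color=['green', 'blue', 'black'], marker='o')

ax.set_title('LinePlots')
ax.set_ylabel('marks')
```

The single call is shorter, while the ax-based approach above remains useful when the lines come from different dataframes. Call plt.show() afterwards to display the figure.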
Last Updated on October 21, 2021 by admin