Python | Pandas dataframe.clip()
Python is a great language for data analysis, largely because of its ecosystem of data-centric Python packages. Pandas is one of those packages, and it makes importing and analyzing data much easier.
Pandas dataframe.clip() trims values at specified input thresholds. We can use this function to put a lower limit and an upper limit on the values that any cell in the dataframe can have.
Syntax: DataFrame.clip(lower=None, upper=None, axis=None, inplace=False, *args, **kwargs)
Parameters:
lower : Minimum threshold value. All values below this threshold will be set to it.
upper : Maximum threshold value. All values above this threshold will be set to it.
axis : Align object with lower and upper along the given axis.
inplace : Whether to perform the operation in place on the data.
*args, **kwargs : Additional keywords have no effect but might be accepted for compatibility with numpy.
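As a rough illustration of the parameters above, the following sketch clips with only one bound at a time and then clips in place. The frame demo and its values are made up for this illustration and are not part of the examples in this article.

```python
import pandas as pd

# Hypothetical data, used only to illustrate the parameters
demo = pd.DataFrame({"x": [-3, 0, 7], "y": [10, -6, 2]})

# Only a lower bound: values below 0 become 0, larger values are untouched
lower_only = demo.clip(lower=0)

# Only an upper bound: values above 5 become 5, smaller values are untouched
upper_only = demo.clip(upper=5)

# inplace=True modifies the frame itself and returns None
in_place = demo.copy()
result = in_place.clip(lower=-1, upper=1, inplace=True)

print(lower_only)
print(upper_only)
print(in_place)
```

Leaving lower or upper at its default of None simply means that side is not clipped.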
Example #1: Use the clip() function to trim values of a data frame below and above a given threshold value.
# importing pandas as pd
import pandas as pd

# Creating a dataframe using dictionary
df = pd.DataFrame({"A": [-5, 8, 12, -9, 5, 3],
                   "B": [-1, -4, 6, 4, 11, 3],
                   "C": [11, 4, -8, 7, 3, -2]})

# Printing the data frame for visualization
df
Now trim all the values below -4 to -4 and all the values above 9 to 9. Values between -4 and 9 remain the same.
# Clip in range (-4, 9)
df.clip(-4, 9)
Output :
Notice that no value in the data frame is greater than 9 or smaller than -4.
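One way to confirm this without reading the table by eye is sketched below. It rebuilds the data frame from Example #1 and inspects the clipped result; the variable names clipped and changed are chosen here for illustration only.

```python
import pandas as pd

# Same data as in Example #1
df = pd.DataFrame({"A": [-5, 8, 12, -9, 5, 3],
                   "B": [-1, -4, 6, 4, 11, 3],
                   "C": [11, 4, -8, 7, 3, -2]})

clipped = df.clip(-4, 9)

# Smallest and largest value across the whole clipped frame
print(clipped.min().min(), clipped.max().max())

# Number of cells per column that were altered by the clipping
changed = (clipped != df).sum()
print(changed)
```

The original df is left unchanged, because clip() returns a new data frame unless inplace=True is passed.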
Example #2: Use the clip() function to clip using specific lower and upper thresholds for each column element in the dataframe.
# importing pandas as pd
import pandas as pd

# Creating a dataframe using dictionary
df = pd.DataFrame({"A": [-5, 8, 12, -9, 5, 3],
                   "B": [-1, -4, 6, 4, 11, 3],
                   "C": [11, 4, -8, 7, 3, -2]})

# Printing the dataframe
df
When axis=0, the thresholds are aligned with the row index, so each row is clipped against its own pair of limits. We are going to provide an upper and a lower threshold for every column element (i.e. as many thresholds as there are rows).
Create a Series to store the lower and upper threshold values for each column element.
# lower limit for each individual column element.
lower_limit = pd.Series([1, -3, 2, 3, -2, -1])

# upper limit for each individual column element.
upper_limit = lower_limit + 5

# Print lower_limit
print(lower_limit)

# Print upper_limit
print(upper_limit)
Output :
Now we want to apply these limits on the dataframe.
# applying different limit value for each column element
df.clip(lower_limit, upper_limit, axis=0)
Output :
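The row-wise clipping can be checked with a short sketch like the one below. It rebuilds the data frame and limits from Example #2 and then tests that every cell lies within the limits of its own row; the names clipped and within are chosen here for illustration only.

```python
import pandas as pd

# Same data and limits as in Example #2
df = pd.DataFrame({"A": [-5, 8, 12, -9, 5, 3],
                   "B": [-1, -4, 6, 4, 11, 3],
                   "C": [11, 4, -8, 7, 3, -2]})
lower_limit = pd.Series([1, -3, 2, 3, -2, -1])
upper_limit = lower_limit + 5

# Row i is clipped to the interval [lower_limit[i], upper_limit[i]]
clipped = df.clip(lower_limit, upper_limit, axis=0)
print(clipped)

# Compare each cell with the limits of its row (axis=0 aligns on the index)
within = clipped.ge(lower_limit, axis=0) & clipped.le(upper_limit, axis=0)
print(within.all().all())
```

For instance, the first row has limits 1 and 6, so its values -5, -1 and 11 become 1, 1 and 6.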
Last Updated on October 24, 2021 by admin