Normal Equation in Linear Regression

The Normal Equation is an analytical approach to Linear Regression with a Least Squares cost function. It lets us find the value of θ directly, without using Gradient Descent, which makes it an effective and time-saving option when working with a dataset that has a small number of features.
The Normal Equation is as follows:

\theta = (X^{T}X)^{-1}X^{T}y

In the above equation,
θ: the hypothesis parameters that best fit the data.
X: the input feature values of each instance.
y: the output value of each instance.
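
As a quick illustration, here is a minimal NumPy sketch of this formula on a tiny toy data set (the data and variable names are our own, purely for illustration):

import numpy as np

# Toy design matrix X (first column is x0 = 1) and target vector y.
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([[2.0], [3.0], [4.0]])

# theta = (X^T X)^(-1) X^T y
theta = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)
print(theta)   # [[1.], [1.]] -- y = 1 + 1*x fits this toy data exactly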


Maths Behind the Equation –

Given the hypothesis function

h_{\theta}(x) = \theta_{0}x_{0} + \theta_{1}x_{1} + \dots + \theta_{n}x_{n}

n: the number of features in the data set.
x0: 1 (for vector multiplication)
Notice that this is a dot product between θ and x values. So, for convenience, we can write it as:

h_{\theta}(x) = \theta^{T}x

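As a small hedged sketch of this dot-product form (the values are chosen arbitrarily):

import numpy as np

theta = np.array([0.5, 2.0])   # [theta_0, theta_1]
x = np.array([1.0, 3.0])       # [x_0 = 1, x_1]

# h(x) = theta^T x = theta_0 * x_0 + theta_1 * x_1
print(theta.dot(x))            # 6.5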
The motive in Linear Regression is to minimize the cost function:

J(\Theta) = \frac{1}{2m} \sum_{i=1}^{m} [h_{\Theta}(x^{(i)}) - y^{(i)}]^{2}

where,

x^{(i)}: the input values of the ith training example.
m: the number of training instances.
n: the number of features in the data set.
y^{(i)}: the expected output of the ith instance.
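
As a quick check of this formula, here is a sketch that computes the cost with an explicit loop over the training examples (the toy data and names are our own):

import numpy as np

def cost(theta, X, y):
    # J(theta) = (1 / (2m)) * sum over i of (h_theta(x_i) - y_i)^2
    m = len(y)
    total = 0.0
    for i in range(m):
        h = theta.dot(X[i])          # hypothesis for the ith instance
        total += (h - y[i]) ** 2
    return total / (2 * m)

X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([2.0, 3.0, 4.0])
print(cost(np.array([1.0, 1.0]), X, y))   # 0.0 -- this theta fits exactly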
Let us represent the cost function in vector form. The residuals over all m training examples can be stacked as

\begin{bmatrix} h_{\theta}(x^{(1)}) - y^{(1)} \\ h_{\theta}(x^{(2)}) - y^{(2)} \\ \vdots \\ h_{\theta}(x^{(m)}) - y^{(m)} \end{bmatrix}
We have ignored 1/2m here, as it will not make any difference in the working: it was used for mathematical convenience while calculating the gradient in Gradient Descent, but it is no longer needed here. Writing each hypothesis out in terms of the features, the vector becomes


\begin{bmatrix} \theta_{0}x_{0}^{(1)} + \theta_{1}x_{1}^{(1)} + \dots + \theta_{n}x_{n}^{(1)} - y^{(1)} \\ \vdots \\ \theta_{0}x_{0}^{(m)} + \theta_{1}x_{1}^{(m)} + \dots + \theta_{n}x_{n}^{(m)} - y^{(m)} \end{bmatrix}

x_{j}^{(i)}: the value of the jth feature in the ith training example.
This can further be reduced to

X\theta - y
But each residual value is squared. We cannot simply square the above expression, as the square of a vector/matrix is not equal to the square of each of its values. So, to get the squared values, we multiply the vector by its transpose. The final expression derived is

(X\theta - y)^{T}(X\theta - y)
Therefore, the cost function is

J(\theta) = (X\theta - y)^{T}(X\theta - y)

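As a sanity check, here is a short sketch (toy data, our own names) confirming that this vectorized form agrees with the explicit sum of squared residuals:

import numpy as np

X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([[2.5], [3.0], [4.5]])
theta = np.array([[1.0], [1.0]])

residuals = X.dot(theta) - y                     # X*theta - y
vectorized = residuals.T.dot(residuals)[0, 0]    # (X*theta - y)^T (X*theta - y)
looped = sum(residuals[i, 0] ** 2 for i in range(len(residuals)))
print(vectorized, looped)                        # both print 0.5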

So, now we obtain the value of θ by taking the derivative of the cost function and setting it to zero:

\frac{\partial J(\theta)}{\partial \theta} = \frac{\partial}{\partial \theta}\left[(X\theta - y)^{T}(X\theta - y)\right]

= \frac{\partial}{\partial \theta}\left[\theta^{T}X^{T}X\theta - \theta^{T}X^{T}y - y^{T}X\theta + y^{T}y\right]

Since \theta^{T}X^{T}y and y^{T}X\theta are scalars and equal to each other, this simplifies to

\frac{\partial J(\theta)}{\partial \theta} = 2X^{T}X\theta - 2X^{T}y

Setting the derivative to zero,

2X^{T}X\theta - 2X^{T}y = 0

X^{T}X\theta = X^{T}y

\theta = (X^{T}X)^{-1}X^{T}y

So, this is the finally derived Normal Equation, with θ giving the minimum cost value.
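
As an implementation note, inverting X^T X explicitly can be numerically unstable when the matrix is close to singular; a common alternative is to solve the linear system X^T X θ = X^T y directly. A minimal sketch (the helper name normal_equation is our own):

import numpy as np

def normal_equation(X, y):
    # Solve (X^T X) theta = (X^T y) directly; np.linalg.solve is generally
    # more stable than forming the inverse with np.linalg.inv.
    return np.linalg.solve(X.T.dot(X), X.T.dot(y))

X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([[2.0], [3.0], [4.0]])
print(normal_equation(X, y))   # [[1.], [1.]]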




# This code may not run on GFG IDE
# as required modules are not found.

# import required modules
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression

# Create data set.
x, y = make_regression(n_samples=100, n_features=1,
                       n_informative=1, noise=10, random_state=10)

# Plot the generated data set.
plt.scatter(x, y, s=30, marker='o')
plt.xlabel("Feature_1 --->")
plt.ylabel("Target_Variable --->")
plt.title('Simple Linear Regression')
plt.show()

# Convert target variable array from 1d to 2d.
y = y.reshape(100, 1)

Let's implement the Normal Equation:

# code
# Adding x0 = 1 to each instance.
x_new = np.array([np.ones(len(x)), x.flatten()]).T
# Using the Normal Equation.
theta_best_values = np.linalg.inv(x_new.T.dot(x_new)).dot(x_new.T).dot(y)
# Display best values obtained.
print(theta_best_values)
[[ 0.52804151]

Try to predict for new data instances:

# code
# Sample data instances.
x_sample = np.array([[-2], [4]])
# Adding x0 = 1 to each instance.
x_sample_new = np.array([np.ones(len(x_sample)), x_sample.flatten()]).T
# Display the sample.
print("Before adding x0:\n", x_sample)
print("After adding x0:\n", x_sample_new)
Before adding x0:
 [[-2]
 [ 4]]
After adding x0:
 [[ 1. -2.]
 [ 1.  4.]]
# code
# Predict the values for the given data instances.
predictions = x_sample_new.dot(theta_best_values)
print("Predictions:\n", predictions)

Plot the output:

# code
# Plot the output.
plt.scatter(x, y, s=30, marker='o')
plt.plot(x, x_new.dot(theta_best_values), color='red')   # fitted regression line
plt.scatter(x_sample, predictions, s=60, color='green')  # predicted samples
plt.xlabel("Feature_1 --->")
plt.ylabel("Target_Variable --->")
plt.title('Simple Linear Regression')
plt.show()

Verify the above using the sklearn LinearRegression class:

# code
# Verification.
from sklearn.linear_model import LinearRegression
lr = LinearRegression()      # Object.
lr.fit(x, y)                 # fit method.
# Print obtained theta values.
print("Best value of theta:", lr.intercept_, lr.coef_, sep='\n')
print("predicted value:", lr.predict(x_sample), sep='\n')

The θ values and predictions printed here should match the ones obtained from the Normal Equation above.
