Linear Regression: Every AI/ML Engineer’s First Step
January 28, 2025
In every AI/ML engineer’s path there’s a 95% possibility they’ve learned
linear regression as their first machine learning model, thanks to its simplicity,
interpretability and speed. It is one of the most commonly used algorithms in
machine learning.
Let’s understand what’s behind this algorithm which makes it so valuable.
Linear Regression is a supervised learning technique, primarily used for
predicting quantitative outcomes. It models the relationship between
variables by fitting a linear equation to the observed data.
There are two main types of Linear Regression: Simple Linear Regression and
Multiple Linear Regression.

Simple Linear Regression is used for predicting a quantitative response on
the basis of a single predictor variable. Let’s name the predictor variable,
also known as the independent variable, as X, and the response variable, aka
the dependent variable, as Y.
Simple Linear Regression assumes there’s approximately a linear relationship
between X and Y. Mathematically we can write this relationship as

Y ≈ β0 + β1X

In this equation β0 (beta zero) represents the intercept and β1 (beta one)
represents the slope of the linear model. Together those two are known as
the model coefficients. We use the approximation symbol (≈) because in
real-life data there’s always some degree of error that prevents the
relationship from being exact.
Let’s assume there’s a connection between the ‘TV advertising budget’ and the
‘Sales’ of a company. When we fit a linear model to the training data, the
fitted relationship can be written as Sales ≈ β0 + β1 × TV.
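The TV-and-Sales example above can be sketched in code. This is a minimal illustration using synthetic data: the budgets, the noise level, and the “true” coefficients (7.0 and 0.05) are made up for demonstration, not real advertising figures.

```python
import numpy as np

# Synthetic example: made-up budgets and coefficients, not real data.
rng = np.random.default_rng(0)
tv_budget = rng.uniform(0, 300, size=100)                     # X: TV ad budget
sales = 7.0 + 0.05 * tv_budget + rng.normal(0, 1, size=100)   # Y, with noise

# Fit Y ≈ β0 + β1·X by ordinary least squares.
X = np.column_stack([np.ones_like(tv_budget), tv_budget])
beta, *_ = np.linalg.lstsq(X, sales, rcond=None)
b0, b1 = beta
print(f"intercept β0 ≈ {b0:.2f}, slope β1 ≈ {b1:.4f}")
```

Because the data was generated from known coefficients, the fitted β0 and β1 land close to 7.0 and 0.05, which is exactly what “learning the relationship” means here.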
What happens while fitting the data is that the model learns the relationship
between X and Y and tunes the coefficient (beta) values to minimize the error
between the predicted values and the true values. This process is also
known as minimizing the loss function.
Let’s measure the error here using the Residual Sum of Squares (RSS). RSS is
exactly what it sounds like: we measure the residual for each prediction, the
difference between the true value yi and the predicted value ŷi, and sum
their squares,

RSS = (y1 − ŷ1)² + (y2 − ŷ2)² + … + (yn − ŷn)²
To get better predictions, we need the lowest error possible. As you can
see, we can do this by tuning the beta values. Let’s use the Mean Squared
Error (MSE) as the loss function for this. MSE is the average of the RSS:
MSE = RSS / n.
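The RSS and MSE described above are a few lines of NumPy. The true and predicted values here are toy numbers chosen only to make the arithmetic easy to follow.

```python
import numpy as np

# Toy values chosen for illustration.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.5, 6.5, 9.5])

residuals = y_true - y_pred
rss = np.sum(residuals ** 2)    # Residual Sum of Squares
mse = rss / len(y_true)         # MSE is the average of the squared residuals
print(rss, mse)                 # → 1.0 0.25
```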
By fitting the data, the algorithm minimizes the loss function and finds the
best values for the coefficients, making the model’s predictions as close as
possible to the actual values. That’s how we get the best accuracy for our
simple linear model.
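One way to see the “tuning” in action is a minimal gradient-descent sketch that minimizes the MSE directly. This is just one possible optimization method (simple linear regression also has a closed-form solution); the toy data, learning rate, and iteration count here are arbitrary choices for illustration.

```python
import numpy as np

# Noiseless toy data generated from β0 = 2, β1 = 3.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 + 3.0 * x

b0, b1 = 0.0, 0.0   # start from arbitrary coefficients
lr = 0.02           # learning rate (arbitrary choice)
for _ in range(20000):
    err = (b0 + b1 * x) - y
    # Gradients of MSE with respect to each coefficient.
    b0 -= lr * 2 * err.mean()
    b1 -= lr * 2 * (err * x).mean()

print(round(b0, 3), round(b1, 3))   # converges toward 2.0 and 3.0
```

Each step nudges the betas in the direction that reduces the MSE, which is exactly the loss-minimization described above.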
When it comes to Multiple Linear Regression there are more predictor
variables than in Simple Linear Regression, but the fitting process remains
the same as before. It assumes there’s approximately a linear relationship
between the n predictor variables X1, …, Xn and Y, which we can write as

Y ≈ β0 + β1X1 + β2X2 + … + βnXn

where Xi represents the i-th predictor and βi quantifies the association
between that variable and the response.
Let’s look at an example of a connection between two predictor variables,
X1 and X2, and the response Y using Multiple Linear Regression.

In this three-dimensional setting the least-squares regression becomes a
plane. The plane is chosen to minimize the sum of the squared vertical
distances between each observation and the plane. While fitting, just as in
the simple linear model, the algorithm tunes the coefficient values to
minimize the loss function and make the model’s predictions as close as
possible to the actual values.
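The two-predictor plane can be fitted the same way as the simple case, just with a wider design matrix. As before, this uses synthetic data: the “true” coefficients (1.0, 2.0, −0.5) and the noise level are invented for the example.

```python
import numpy as np

# Synthetic two-predictor data (made-up coefficients, for illustration).
rng = np.random.default_rng(1)
x1 = rng.uniform(0, 10, size=200)
x2 = rng.uniform(0, 10, size=200)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(0, 0.1, size=200)

# Design matrix [1, X1, X2]; lstsq minimizes the sum of squared
# vertical distances between each observation and the fitted plane.
X = np.column_stack([np.ones_like(x1), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta, 2))   # ≈ [1.0, 2.0, -0.5]
```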
Linear regression is widely used across many industries today. Some
examples are:

• Health care – predict patient outcomes based on clinical parameters, and
recognize patterns and connections between diseases
Linear regression stands as a foundational algorithm in machine learning due
to its simplicity, interpretability and efficiency. By finding the relationships
between variables and minimizing the error, it lays the foundation for more
complex models and algorithms. Linear Regression serves as an entry point
for beginners while remaining a powerful tool.