Multiple Linear Regression
Multiple Linear Regression, Matrix Method, Coefficients, Least Squares, Multivariate
Multiple Linear Regression extends simple linear regression to handle multiple input features at once. Instead of drawing a line through 2D points, it fits a 'hyperplane' through multi-dimensional data. For example, predicting a house price based on both its size AND its age — not just one factor. The math uses matrices to solve for all the coefficients (b_0, b_1, b_2...) simultaneously, which is exactly the kind of numerical problem your 5th-semester exams will test.
The Prediction Model & Normal Equation
Prediction model: Y = b_0 + b_1*X_1 + b_2*X_2 + ... + b_n*X_n
Normal Equation: B = (X^T X)^-1 X^T Y
What do these variables mean?
- Y: The predicted output value. This is what we calculate at the very end.
- b_0: The intercept (bias). The base value of Y when all features are exactly 0.
- b_1, b_2: The coefficients (weights) for each feature. They tell you how much Y changes per 1-unit increase in that specific feature.
- B: The coefficient vector. This is just a column vector containing [b_0, b_1, b_2...].
- X: The design matrix. Your dataset's feature values, but with a leading column of all 1s added to calculate b_0.
- X^T: The transpose of X (flipping the rows and columns).
- (X^T X)^-1: The inverse of the product X^T X. This is the hardest part to calculate manually!
- The Normal Equation: The bottom formula. It calculates every single coefficient in B simultaneously, in one mathematical sweep.
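For a dataset with n rows and 2 features, the pieces fit together like this (an illustrative layout matching the definitions above):

```latex
X = \begin{bmatrix}
1 & x_{11} & x_{12} \\
1 & x_{21} & x_{22} \\
\vdots & \vdots & \vdots \\
1 & x_{n1} & x_{n2}
\end{bmatrix},
\qquad
Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix},
\qquad
B = (X^{T} X)^{-1} X^{T} Y
  = \begin{bmatrix} b_0 \\ b_1 \\ b_2 \end{bmatrix}
```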
How Does it Work?
1. Build the X matrix from your dataset: add a leading column of all 1s (for b_0), then your feature columns side by side.
2. Build the Y matrix: a single column of all your output/target values.
3. Calculate the transpose X^T by flipping the rows and columns of your X matrix.
4. Multiply X^T by X to get the square matrix X^T X. Use standard matrix multiplication, row by column.
5. Find the inverse (X^T X)^-1. For a small matrix, use the adjugate and determinant method.
6. Multiply (X^T X)^-1 by X^T to get an intermediate matrix.
7. Multiply that result by Y to get your coefficient vector B.
8. Plug B and your query feature values into Y = b_0 + b_1*X_1 + b_2*X_2 + ... to get the prediction.
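The steps above can be sketched with NumPy (a hypothetical toy dataset; all numbers invented for illustration):

```python
import numpy as np

# Step 1: design matrix X: leading 1s column, then features (size, age).
X = np.array([[1.0,  50.0, 30.0],
              [1.0,  80.0, 10.0],
              [1.0, 120.0,  5.0],
              [1.0,  70.0, 20.0]])
# Step 2: Y: a single column of target values (prices).
Y = np.array([[150.0], [260.0], [400.0], [220.0]])

Xt = X.T                      # Step 3: transpose of X
XtX = Xt @ X                  # Step 4: square 3x3 matrix
XtX_inv = np.linalg.inv(XtX)  # Step 5: inverse (adjugate method when done by hand)
temp = XtX_inv @ Xt           # Step 6: intermediate matrix
B = temp @ Y                  # Step 7: coefficient vector [b_0, b_1, b_2]

# Step 8: predict for a query point, e.g. size=100, age=8.
query = np.array([1.0, 100.0, 8.0])
prediction = float(query @ B)
print(B.ravel(), prediction)
```

In an exam you would carry out steps 3-7 by hand; the code simply mirrors that pipeline one operation per line.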
Important Rules & Conventions
- Exam Trick 1: Always write out the X matrix first with the column of 1s. Students who forget the leading 1s column get B wrong and the entire solution falls apart.
- Exam Trick 2: Double-check your transpose by verifying that the element at row i, col j in X equals the element at row j, col i in X^T.
- Exam Trick 3: To verify your inverse is correct, multiply X^T X by (X^T X)^-1 — you must get the identity matrix (1s on the diagonal, 0s elsewhere).
- The number of rows in your final vector B always equals the number of features + 1 (for b_0). If you have 2 features, you get 3 coefficients: b_0, b_1, b_2.
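Tricks 2 and 3 can both be checked mechanically; a minimal NumPy sketch on a hypothetical 3x3 design matrix:

```python
import numpy as np

# Hypothetical design matrix (1s column + two feature columns).
X = np.array([[1.0, 2.0, 3.0],
              [1.0, 4.0, 5.0],
              [1.0, 7.0, 2.0]])

# Exam Trick 2: element (i, j) of X must equal element (j, i) of X^T.
for i in range(X.shape[0]):
    for j in range(X.shape[1]):
        assert X[i, j] == X.T[j, i]

# Exam Trick 3: (X^T X) times its inverse must give the identity matrix.
XtX = X.T @ X
product = XtX @ np.linalg.inv(XtX)
assert np.allclose(product, np.eye(3))
```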
Advantages
- ✓ Handles multiple features simultaneously — far more realistic than simple linear regression for real-world data.
- ✓ The matrix formula works for any number of features, making it highly scalable.
- ✓ Each coefficient directly tells you the individual impact of that feature on the output, assuming other features are held constant.
Disadvantages
- × Multicollinearity problem: if two of your input features are strongly correlated (e.g., height in cm and height in inches), the X^T X matrix becomes nearly impossible to invert reliably and the coefficients become meaningless.
- × Sensitive to outliers: one extreme data point can shift all coefficients significantly.
- × Requires more data points than features. If you have 3 features but only 2 data points, the system is underdetermined and has no unique solution.
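Both failure modes above show up numerically in X^T X itself. A sketch (hypothetical data) using the condition number and rank as diagnostics:

```python
import numpy as np

# Multicollinearity: height in cm and the same height in inches are perfectly correlated.
cm = np.array([160.0, 170.0, 180.0, 175.0])
inches = cm / 2.54
X = np.column_stack([np.ones(4), cm, inches])

XtX = X.T @ X
cond = np.linalg.cond(XtX)  # enormous: X^T X is (near-)singular, inversion is unreliable
print(cond)

# Underdetermined system: 2 data points but 3 features (+ intercept column).
X_under = np.array([[1.0, 2.0, 3.0, 4.0],
                    [1.0, 5.0, 6.0, 7.0]])
rank = np.linalg.matrix_rank(X_under.T @ X_under)  # rank < 4: no unique solution
print(rank)
```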
Summary
Multiple Linear Regression is the natural evolution of simple linear regression. By expressing the problem in matrix form, we can solve for all coefficients at once using the Normal Equation. While it demands careful attention to matrix operations, it is one of the most fundamental and interpretable tools in predictive modeling — and a guaranteed exam topic.