☠️ Advanced Multivariate Linear Algebra
Detailed explanation of the Normal Equation for linear regression, including matrix formulation, closed-form solution, comparison with gradient descent, and practical considerations for implementation.
Mathematical Object
A mathematical object is an abstract concept that can serve as a value, be assigned to a symbol, and therefore appear in formulas.
- Examples: numbers, expressions, shapes, functions, and sets.
- Complex objects: theorems, proofs.

Tensor
An algebraic object that describes a multilinear relationship between sets of algebraic objects associated with a vector space.
- From Latin tendere, meaning 'to stretch'.
Scalar
Scalars are the real numbers used in linear algebra.
- A single number, a 0-dimensional tensor.
- Example: $3$, $-0.5$, $\pi$, $e$
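As a quick sanity check (a minimal NumPy sketch, since these notes use NumPy below), a scalar really is a 0-dimensional array:

```python
import numpy as np

# A scalar is a 0-dimensional tensor: it has no axes at all.
s = np.array(5.0)
print(s.ndim)   # 0 axes
print(s.shape)  # empty shape tuple ()
```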
Matrices
A matrix is a 2D array of numbers (a table of numbers).
Representation:
In mathematics:
$A \in \mathbb{R}^{3 \times 4}$
- Where $A$ is a real-valued matrix with 3 rows and 4 columns.
In programming:
import numpy as np

A = np.array([[85, 76, 66, 5],
              [94, 75, 18, 28],
              [68, 40, 71, 5]])
In theory:
- Uppercase letters (A, B, X) → Matrices
- Lowercase letters (x, y, z) → Vectors or scalars
Dimension:
An $m \times n$ matrix, where:
- $m$ = number of rows
- $n$ = number of columns
- The example matrix $A$ above is $3 \times 4$.
Square Matrix
- A matrix with the same number of rows and columns ($m = n$).
Element notation: $A_{i,j}$
- The element in the $i$-th row and $j$-th column.
Example (using the $3 \times 4$ matrix $A$ above):
- $A_{1,1} = 85$ → Row 1, Column 1
- $A_{3,2} = 40$ → Row 3, Column 2
- $A_{2,3} = 18$ → Row 2, Column 3
- $A_{2,4} = 28$ → Row 2, Column 4
- $A_{6,4}$ → 6th row, 4th column does not exist
Use in Machine Learning
Matrices represent the data matrix, model parameters, and transformations.
If we have:
- $m$ training examples
- $n$ features
The data matrix is:
$X \in \mathbb{R}^{m \times n}$
Dimension $m \times n$, where:
- Each row = one training example
- Each column = one feature
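A small NumPy sketch of this layout (the dataset values are hypothetical, for illustration only):

```python
import numpy as np

# Hypothetical toy dataset: m = 4 training examples, n = 2 features
# (house size, number of bedrooms) -- illustrative values only.
X = np.array([[2104, 5],
              [1416, 3],
              [1534, 3],
              [ 852, 2]])

m, n = X.shape          # rows = examples, columns = features
first_example = X[0]    # one row = one training example
size_feature  = X[:, 0] # one column = one feature across all examples
```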
Vectors
A vector is a matrix with 1 column (an $n \times 1$ matrix).
- Represents a point in high-dimensional space.
- From Latin vector, meaning "carrier" or "driver".
- Has a direction and a magnitude (length).
Represented as:
In mathematics:
$y = \begin{bmatrix} 460 \\ 232 \\ 315 \\ 178 \end{bmatrix}$
In programming:
y = np.array([460, 232, 315, 178])
In theory:
- Lowercase letters (x, y, z) → Vectors
Dimension:
$y \in \mathbb{R}^4$
In ML, vectors represent:
- A data point → a vector
- A feature column → a direction
- A model weight vector → a direction of best fit
Example:
- $y$ above is a 4 × 1 matrix, or equivalently a 4-dimensional vector.
Element Indexing
$y_i$ = the $i$-th element.
- In mathematics, indexing usually starts at 1.
- In programming, indexing often starts at 0.
- Unless otherwise specified, assume one-indexed notation in linear algebra.
Example:
- $y_1 = 460$ (one-indexed)
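The one-indexed vs zero-indexed distinction, shown with the vector from the notes:

```python
import numpy as np

y = np.array([460, 232, 315, 178])

# Math is usually 1-indexed: y_1 = 460.
# NumPy is 0-indexed: the same element is y[0].
first = y[0]
third = y[2]   # y_3 in math notation
```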
Transpose ($A^T$)
Transpose swaps rows and columns.
If $A$ is an $m \times n$ matrix, then $A^T$ is an $n \times m$ matrix.
Element-wise: $(A^T)_{i,j} = A_{j,i}$
- A column vector becomes a row vector.
Given:
$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}$
Then:
$A^T = \begin{bmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{bmatrix}$
Used heavily in:
- Normal Equation
- Gradient derivations
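In NumPy, transposition is the `.T` attribute (a minimal sketch with a hypothetical 3×2 matrix):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4],
              [5, 6]])   # shape (3, 2)

At = A.T                 # shape (2, 3): rows and columns swapped
```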
Identity Matrix ($I$)
The identity matrix is the matrix equivalent of the number 1.
It is a square matrix with:
- 1’s on the diagonal
- 0’s everywhere else
Property: $AI = IA = A$
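This property is easy to verify numerically with `np.eye`:

```python
import numpy as np

I = np.eye(3)                      # 3x3 identity: 1's on the diagonal
A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])

# Multiplying by the identity leaves A unchanged: AI = IA = A
left  = A @ I
right = I @ A
```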
Inverse Matrix ($A^{-1}$)
The inverse of a matrix plays the role of division.
- Only square matrices can have inverses.
The matrix inverse satisfies: $A A^{-1} = A^{-1} A = I$
Used in the Normal Equation:
$\theta = (X^T X)^{-1} X^T y$
Not all square matrices are invertible.
1. Invertible / Non-Singular Matrix
A matrix that can be inverted.
- It has an inverse if it is full rank (its rows and columns are linearly independent).
2. Non-Invertible / Singular / Degenerate Matrix
A matrix that does not have an inverse.
- It is not full rank (some rows or columns are linearly dependent).
Causes of a non-invertible matrix:
- Redundant features: two features related by a linear equation, $x_2 = k x_1$, e.g. size in feet and size in meters.
- More features than training examples ($m \le n$): delete some features or use regularization.
Octave functions for inverting a matrix:
- pinv(A): pseudo-inverse; returns a result even if the matrix is non-invertible.
- inv(A): inverse; only valid for invertible matrices.
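The NumPy counterparts (since the rest of these notes use NumPy) are `np.linalg.pinv` and `np.linalg.inv`:

```python
import numpy as np

A = np.array([[1., 2.],
              [2., 4.]])   # singular: row 2 = 2 * row 1

# np.linalg.inv would raise LinAlgError for A;
# the pseudo-inverse is defined for any matrix.
A_pinv = np.linalg.pinv(A)

B = np.array([[4., 7.],
              [2., 6.]])   # invertible (det = 10)
B_inv = np.linalg.inv(B)
```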
Determinant ($\det(A)$ or $|A|$)
The determinant tells us whether a matrix is invertible.
For a 2 × 2 matrix:
$\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc$
If $\det(A) \ne 0$: the matrix is invertible.
If $\det(A) = 0$: the matrix is singular (not invertible), and the corresponding linear system has:
- Either no solution
- Or infinitely many solutions
Use in Machine Learning:
- The Normal Equation requires matrix inversion.
Closed-form solution:
$\theta = (X^T X)^{-1} X^T y$
- In practice, we use numerical methods to avoid the instability of explicit matrix inversion.
- Regularization can make matrices invertible by adding a small value to the diagonal (Ridge Regression): $X^T X + \lambda I$.
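A minimal sketch of the closed-form solution with a small ridge term, on synthetic data (the generating weights $[2, 3]$ and $\lambda$ value are illustrative assumptions):

```python
import numpy as np

# Synthetic data: y = 2*x1 + 3*x2, no noise (illustrative only).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 2))
y = X @ np.array([2.0, 3.0])

# Normal equation with ridge term lambda * I to keep X^T X invertible.
# np.linalg.solve is preferred over computing the inverse explicitly.
lam = 1e-8
theta = np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ y)
```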
Matrix as a Transformation
A matrix is a transformation of space.
All machine learning models are compositions of transformations.
If: $y = Ax$
Then $A$ transforms the vector $x$ into a new vector $y$.
Geometrically, a matrix can:
- Stretch
- Compress
- Rotate
- Reflect
- Shear
- Project
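One concrete example of such a transformation is a 2D rotation matrix (a standard construction, shown here rotating a point by 90 degrees):

```python
import numpy as np

# 2D rotation matrix for angle theta (here, 90 degrees)
angle = np.pi / 2
R = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])

x = np.array([1.0, 0.0])
y = R @ x   # rotates x from the x-axis onto the y-axis
```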
Matrix Addition/Subtraction
When is addition allowed?
Addition is done element by element.
Two matrices can be added only if they have the same dimensions.
If $A$ and $B$ are both $m \times n$ matrices,
then: $C = A + B$
where $C_{i,j} = A_{i,j} + B_{i,j}$,
and $C$ is also an $m \times n$ matrix.
Subtraction works the same way, with minus signs.
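Element-wise addition and subtraction in NumPy (with hypothetical 2×2 matrices):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[10, 20],
              [30, 40]])

C = A + B   # element-wise; requires matching shapes
D = A - B
```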
Scalar Multiplication / Division
Scalar multiplication multiplies every element of a matrix by a single number (the scalar); scalar division divides every element by it.
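In NumPy this is ordinary `*` and `/` between a number and an array:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])

B = 3 * A    # every element multiplied by 3
C = A / 2    # every element divided by 2
```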
Matrix-Matrix Multiplication
Element-wise definition: $C_{i,j} = \sum_k A_{i,k} B_{k,j}$ (sum over $k$)
Given 2 matrices:
- $A$ is $m \times n$
- $B$ is $n \times p$
Then: $C = AB$
where $C$ is a new matrix with dimensions:
- $m \times p$
- The inner dimensions must match ($n$): $(m \times n)(n \times p) \to (m \times p)$
Properties:
- Not commutative: $AB \ne BA$ in general (order matters)
- Associative: $(AB)C = A(BC)$
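A small sketch of the dimension rule and non-commutativity, using NumPy's `@` operator:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # 2 x 3
B = np.array([[1, 0],
              [0, 1],
              [1, 1]])      # 3 x 2

C = A @ B                   # inner dims match: (2x3)(3x2) -> 2x2
# Not commutative: reversing the order gives a different result
D = B @ A                   # (3x2)(2x3) -> 3x3, a different shape entirely
```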
Use in Machine Learning
Everything in deep learning is matrix multiplication:
- Inputs × Weights
- Weights × Activations
- Gradient updates
Neural network forward pass (each layer): $a = g(Wx + b)$, where $g$ is the activation function.
Backpropagation is also matrix calculus.
Understanding multivariate linear algebra makes deep learning much easier to grasp.
Vectorization: Matrix-Vector Multiplication
If:
- $A$ is an $m \times n$ matrix
- $x$ is an $n \times 1$ vector
Then $Ax$:
- Produces an $m \times 1$ vector
Use in Machine Learning
- This gives predictions for all training examples in one operation.
- Faster computation: optimized hardware usage (CPU/GPU)
- Clean mathematical formulation
Linear regression hypothesis:
$h_\theta(x) = \theta_0 + \theta_1 x_1 + \dots + \theta_n x_n = \theta^T x$
For all training examples at once:
$\hat{y} = X\theta$
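The vectorized prediction is a single matrix-vector product (a sketch with hypothetical data; the first column of $X$ is all ones for the intercept term):

```python
import numpy as np

# Hypothetical design matrix: column of 1's (intercept) + one feature
X = np.array([[1., 2104.],
              [1., 1416.],
              [1., 1534.]])
theta = np.array([50., 0.1])   # illustrative parameters

y_hat = X @ theta   # predictions for all training examples in one operation
```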
Dot Product ($a \cdot b$)
The dot product is defined between two vectors of the same dimension.
If $a, b \in \mathbb{R}^n$,
then their dot product is:
$a \cdot b = a^T b = \sum_{i=1}^{n} a_i b_i$
It produces a single number (a scalar).
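In NumPy, `np.dot` (or `a @ b`) computes this sum of element-wise products:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

d = np.dot(a, b)   # 1*4 + 2*5 + 3*6
```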
Summary
Key ideas:
- Vectors represent features and parameters
- Matrices represent datasets
- Matrix multiplication enables fast prediction
- Transpose and inverse enable optimization
- Vectorization is essential for performance
