Neural Networks Introduction
A concise cheat sheet covering core concepts, dimensions, activation functions, forward propagation, cost function, backpropagation, gradient checking, random initialization, training pipeline, and key intuition for neural networks.
Neural Networks — Revision Cheat Sheet
1️⃣ Core Concepts
Neural Network Structure
- Input layer
- Hidden layer(s)
- Output layer
Each layer computes a linear combination of the previous layer's activations, then applies an activation function:

$$z^{(j+1)} = \Theta^{(j)} a^{(j)}, \qquad a^{(j+1)} = g\big(z^{(j+1)}\big)$$

Add a bias unit to every non-output layer:

$$a_0^{(j)} = 1$$
2️⃣ Dimensions
If:
- Layer $j$ has $s_j$ units
- Layer $j+1$ has $s_{j+1}$ units

Then $\Theta^{(j)}$ has dimension:

$$s_{j+1} \times (s_j + 1)$$

- The +1 accounts for the bias unit
- Output layer size = number of classes
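As a quick sanity check, a sketch in Octave using a hypothetical 400-25-10 architecture (400 inputs, 25 hidden units, 10 output classes; the sizes are illustrative only):

```matlab
% Hypothetical 400-25-10 network: s1 = 400, s2 = 25, s3 = 10
Theta1 = zeros(25, 400 + 1);   % maps layer 1 -> 2: s2 x (s1 + 1)
Theta2 = zeros(10, 25 + 1);    % maps layer 2 -> 3: s3 x (s2 + 1)
disp(size(Theta1))             % 25 401
disp(size(Theta2))             % 10 26
```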
3️⃣ Activation Function
Most common: Sigmoid

$$g(z) = \frac{1}{1 + e^{-z}}$$

Derivative (important for backprop):

$$g'(z) = g(z)\,\big(1 - g(z)\big)$$
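A minimal Octave sketch of both functions (the names `sigmoid` and `sigmoidGradient` are illustrative, not part of any library):

```matlab
function g = sigmoid(z)
  % Element-wise logistic function; works on scalars, vectors, and matrices
  g = 1 ./ (1 + exp(-z));
end

function gp = sigmoidGradient(z)
  % g'(z) = g(z) .* (1 - g(z)), needed for the hidden-layer deltas in backprop
  g  = sigmoid(z);
  gp = g .* (1 - g);
end
```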
4️⃣ Forward Propagation
Starting from the input, $a^{(1)} = x$ (with bias $a_0^{(1)} = 1$), for each layer $j = 1, \dots, L-1$:
- Compute: $z^{(j+1)} = \Theta^{(j)} a^{(j)}$
- Apply activation: $a^{(j+1)} = g\big(z^{(j+1)}\big)$, adding the bias unit for hidden layers

Final output: $h_\Theta(x) = a^{(L)}$
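A vectorized Octave sketch for a 3-layer network, assuming `X` holds one example per row and `Theta1`, `Theta2`, and `sigmoid` are as in the sketches above:

```matlab
m  = size(X, 1);
a1 = [ones(m, 1) X];            % a^(1) with bias column, m x (s1 + 1)
z2 = a1 * Theta1';              % z^(2), m x s2
a2 = [ones(m, 1) sigmoid(z2)];  % a^(2) with bias column, m x (s2 + 1)
z3 = a2 * Theta2';              % z^(3), m x K
h  = sigmoid(z3);               % h_Theta(x) = a^(3), one prediction row per example
```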
5️⃣ Cost Function (Multiclass)
$$J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} \left[ y_k^{(i)} \log\big(h_\Theta(x^{(i)})\big)_k + \big(1 - y_k^{(i)}\big) \log\Big(1 - \big(h_\Theta(x^{(i)})\big)_k\Big) \right] + \frac{\lambda}{2m} \sum_{l=1}^{L-1} \sum_{i=1}^{s_l} \sum_{j=1}^{s_{l+1}} \big(\Theta_{ji}^{(l)}\big)^2$$

- Double sum → over the $m$ training examples and the $K$ output units
- Regularization → sum of squared weights
- Bias terms are NOT regularized
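A vectorized Octave sketch, assuming `h` comes from the forward pass above and `Y` is an m x K one-hot label matrix (both names carried over from the earlier sketches):

```matlab
% Cross-entropy term: double sum over examples (rows) and output units (columns)
J = (-1 / m) * sum(sum(Y .* log(h) + (1 - Y) .* log(1 - h)));

% Regularization term: squared weights, skipping column 1 (the bias weights)
reg = (lambda / (2 * m)) * (sum(sum(Theta1(:, 2:end) .^ 2)) + ...
                            sum(sum(Theta2(:, 2:end) .^ 2)));
J = J + reg;
```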
6️⃣ Backpropagation
Output Layer Error

$$\delta^{(L)} = a^{(L)} - y$$

Hidden Layer Error ($\odot$ is the element-wise product)

$$\delta^{(l)} = \big(\Theta^{(l)}\big)^T \delta^{(l+1)} \odot g'\big(z^{(l)}\big)$$

Gradient Accumulation

$$\Delta^{(l)} := \Delta^{(l)} + \delta^{(l+1)} \big(a^{(l)}\big)^T$$

Final Gradient

For non-bias terms ($j \neq 0$):

$$D_{ij}^{(l)} = \frac{1}{m} \Delta_{ij}^{(l)} + \frac{\lambda}{m} \Theta_{ij}^{(l)}$$

For bias terms ($j = 0$, no regularization):

$$D_{ij}^{(l)} = \frac{1}{m} \Delta_{ij}^{(l)}$$
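A vectorized Octave sketch of one backward pass for the same 3-layer network, reusing `a1`, `a2`, `z2`, `h`, and `Y` from the sketches above:

```matlab
d3 = h - Y;                                           % delta^(3), m x K
d2 = (d3 * Theta2(:, 2:end)) .* sigmoidGradient(z2);  % delta^(2); bias column of Theta2 dropped

Delta1 = d2' * a1;                                    % accumulated gradients, s2 x (s1 + 1)
Delta2 = d3' * a2;                                    % K x (s2 + 1)

Theta1_grad = Delta1 / m;
Theta2_grad = Delta2 / m;

% Regularize every column except the first (bias) one
Theta1_grad(:, 2:end) = Theta1_grad(:, 2:end) + (lambda / m) * Theta1(:, 2:end);
Theta2_grad(:, 2:end) = Theta2_grad(:, 2:end) + (lambda / m) * Theta2(:, 2:end);
```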
7️⃣ Gradient Checking
Numerical approximation (two-sided difference, with $\epsilon \approx 10^{-4}$):

$$\frac{\partial J}{\partial \theta_i} \approx \frac{J(\theta + \epsilon e_i) - J(\theta - \epsilon e_i)}{2\epsilon}$$

Use:
- Only for debugging (it is far too slow for training)
- Disable after verifying that the backprop gradients match
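A sketch of the check in Octave; `costFunc` is assumed to be a function handle that returns $J$ for an unrolled parameter vector:

```matlab
function numgrad = computeNumericalGradient(costFunc, theta)
  % Perturb one parameter at a time and take the symmetric difference
  EPSILON = 1e-4;
  numgrad = zeros(size(theta));
  perturb = zeros(size(theta));
  for p = 1:numel(theta)
    perturb(p) = EPSILON;
    numgrad(p) = (costFunc(theta + perturb) - costFunc(theta - perturb)) / (2 * EPSILON);
    perturb(p) = 0;
  end
end
```

Compare `numgrad` against the backprop gradient; the two should agree to several significant digits.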
8️⃣ Random Initialization
Do NOT initialize weights to zero.
Initialize each weight to a small random value in $[-\epsilon, \epsilon]$:

```matlab
% x = s_(j+1) rows, y = s_j + 1 columns; uniform values in [-epsilon, epsilon]
Theta = rand(x, y) * (2 * epsilon) - epsilon;
```

This breaks symmetry: with all-zero weights, every unit in a layer computes the same value and receives the same gradient update, so the units never differentiate.
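For example, for the hypothetical 400-25-10 network used earlier ($\epsilon = 0.12$ is a common heuristic, roughly $\sqrt{6}/\sqrt{s_{\text{in}} + s_{\text{out}}}$ for these layer sizes):

```matlab
epsilon = 0.12;
Theta1 = rand(25, 401) * (2 * epsilon) - epsilon;  % uniform in [-epsilon, epsilon]
Theta2 = rand(10, 26)  * (2 * epsilon) - epsilon;
```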
9️⃣ Training Pipeline
- Choose architecture
- Randomly initialize weights
- Forward propagation
- Compute cost
- Backpropagation
- Gradient checking (once)
- Optimize using gradient descent
- Repeat until convergence
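Put together, a minimal batch gradient descent sketch; `nnCostFunction` is a hypothetical helper that runs forward propagation plus backprop and returns the cost and the unrolled gradient vector:

```matlab
theta = [Theta1(:); Theta2(:)];   % unroll all weights into one parameter vector
alpha = 0.1;                      % learning rate (an assumption; tune per problem)
for iter = 1:400
  [J, grad] = nnCostFunction(theta, X, Y, lambda);  % forward pass + backprop
  theta = theta - alpha * grad;                     % gradient descent update
end
```

In practice an off-the-shelf optimizer such as Octave's `fminunc` is often used in place of a hand-rolled loop.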
🔟 Key Intuition
Neural network training is:
- Forward pass → prediction
- Backward pass → compute gradients
- Gradient descent → update weights
Deep learning = repeated application of this process.
