Hitesh Sahu
Training a Neural Network

In this post, we will put together all the pieces we've learned about neural networks to understand how to train a neural network effectively. We will cover the cost function, backpropagation, gradient checking, and random initialization, along with key intuitions for each step.

Data Science
Machine Learning
Deep Learning
Neural Networks
Artificial Intelligence
Computational Graphs

Putting It Together

Now that we have covered forward propagation, backpropagation, and gradient checking, let's combine everything into a complete training pipeline.

1. 🔀 Choose a Network Architecture

First, decide the structure of your neural network:

  • Number of layers $L$
  • Number of hidden units per layer $j$
  • Number of outputs $y$

How to Choose the Network

  • Input layer size = dimension of the feature vector $x^{(i)}$
  • Output layer size = number of output classes
  • Hidden units:
    • More hidden units usually give better performance
    • But they also increase computational cost
  • Default choice:
    • Use 1 hidden layer
    • If using multiple hidden layers, use the same number of units in each layer
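
As a rough illustration of how these choices fix the size of the model, the sketch below lists the shape of each weight matrix for a given list of layer sizes. It assumes the common convention that $\Theta^{(l)}$ maps layer $l$ to layer $l+1$ and carries one extra column for the bias unit. The function name `theta_shapes` and the example sizes are made up for illustration.

```python
def theta_shapes(layer_sizes):
    """Return (rows, cols) of each weight matrix for the given layer sizes.

    Assumes Theta^(l) has shape s_(l+1) x (s_l + 1); the +1 is the bias column.
    """
    return [(n_out, n_in + 1) for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]


# Hypothetical architecture: 400 input features, one hidden layer of 25 units, 10 classes.
shapes = theta_shapes([400, 25, 10])
n_params = sum(rows * cols for rows, cols in shapes)
print(shapes, n_params)
```

Adding hidden units or layers increases `n_params`, which is where the extra computational cost comes from.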

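2. 📚 Training a Neural Network

2.1 🎲 Randomly Initialize Weights

Initialize each $\Theta^{(l)}$ randomly (not to zero).

If all weights start at the same value, every hidden unit in a layer computes the same activation and receives the same update, so the units never become different from one another. Random initialization breaks this symmetry and allows learning.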
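
A minimal sketch of random initialization, assuming each weight is drawn uniformly from a small interval $[-\epsilon_{init}, \epsilon_{init}]$. The value `0.12` and the helper name `random_init` are assumptions for illustration, not values given in this post.

```python
import random


def random_init(rows, cols, epsilon_init=0.12, rng=random):
    """Return a rows x cols matrix with entries drawn uniformly
    from [-epsilon_init, epsilon_init]."""
    return [[rng.uniform(-epsilon_init, epsilon_init) for _ in range(cols)]
            for _ in range(rows)]


rng = random.Random(0)  # fixed seed only so the example is repeatable
theta1 = random_init(3, 4, rng=rng)  # e.g. 3 hidden units, 3 inputs + 1 bias column
```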

2.2 ⏩ Forward Propagation (FP)

For each training example $x^{(i)}$, compute:

$$h_\Theta(x^{(i)})$$

This gives the network's prediction.
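
The forward pass can be sketched in plain Python as follows. This assumes sigmoid activations in every layer and weight matrices stored as lists of rows whose first column multiplies the bias unit. The weights are hand-picked, hypothetical values for a 2-2-1 network that computes XNOR, not trained ones.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(thetas, x):
    """Return the activations of every layer for one example x.

    Each theta is a list of rows; column 0 of each row multiplies the bias unit.
    """
    activations = [list(x)]
    for theta in thetas:
        a_with_bias = [1.0] + activations[-1]
        z = [sum(w * a for w, a in zip(row, a_with_bias)) for row in theta]
        activations.append([sigmoid(v) for v in z])
    return activations


# Hand-picked (hypothetical) weights for a 2-2-1 network that computes XNOR.
theta1 = [[-30.0, 20.0, 20.0],   # hidden unit 1: x1 AND x2
          [10.0, -20.0, -20.0]]  # hidden unit 2: (NOT x1) AND (NOT x2)
theta2 = [[-10.0, 20.0, 20.0]]   # output unit: OR of the two hidden units
h = forward([theta1, theta2], [1.0, 1.0])[-1]  # prediction h_Theta(x)
```

The function returns every layer's activations, not just the last one, because backpropagation reuses them.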

2.3 💰 Implement the Cost Function

Compute:

$$J(\Theta)$$

This includes:

  • Logistic loss over all output units
  • Regularization term
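
A sketch of the two terms, assuming the usual form: the logistic loss is averaged over the $m$ examples, and the regularization term is $\frac{\lambda}{2m}$ times the sum of squared weights with the bias column left out. The function name `cost` and the toy inputs are illustrative only.

```python
import math


def cost(predictions, labels, thetas, lam):
    """Regularized logistic cost J(Theta).

    predictions, labels: one list of K output values per training example.
    thetas: weight matrices as lists of rows; column 0 (bias) is not regularized.
    """
    m = len(predictions)
    loss = 0.0
    for h, y in zip(predictions, labels):
        for h_k, y_k in zip(h, y):
            loss -= y_k * math.log(h_k) + (1 - y_k) * math.log(1 - h_k)
    penalty = sum(w * w for theta in thetas for row in theta for w in row[1:])
    return loss / m + lam / (2 * m) * penalty


# Two examples, one output unit, one tiny weight matrix (bias weight 9.0 is ignored).
J = cost([[0.5], [0.5]], [[1], [0]], [[[9.0, 1.0, 2.0]]], lam=1.0)
```

This sketch does not guard against predictions of exactly 0 or 1, where the logarithm is undefined.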

2.4 ⏪ Backpropagation (BP)

Use backpropagation to compute:

$$\frac{\partial}{\partial \Theta_{i,j}^{(l)}} J(\Theta)$$

This gives the gradients needed for optimization.
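
For a single training example, the computation might look like the sketch below. It assumes one hidden layer, sigmoid activations, the unregularized logistic cost, and weights stored as lists of rows with the bias weight in column 0. The name `backprop_single` and the example weights are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def backprop_single(theta1, theta2, x, y):
    """Return (grad1, grad2) of the unregularized logistic cost for one example."""
    # Forward pass, keeping the bias-augmented activations.
    a1 = [1.0] + list(x)
    a2 = [1.0] + [sigmoid(sum(w * a for w, a in zip(row, a1))) for row in theta1]
    a3 = [sigmoid(sum(w * a for w, a in zip(row, a2))) for row in theta2]

    # Output error: delta3 = a3 - y (sigmoid outputs with logistic loss).
    delta3 = [a - t for a, t in zip(a3, y)]
    # Hidden error: (Theta2^T delta3) * a2 * (1 - a2), skipping the bias unit.
    delta2 = [
        sum(theta2[k][j] * delta3[k] for k in range(len(delta3))) * a2[j] * (1 - a2[j])
        for j in range(1, len(a2))
    ]

    # Gradient entry (i, j) of layer l is delta^(l+1)_i * a^(l)_j.
    grad1 = [[d * a for a in a1] for d in delta2]
    grad2 = [[d * a for a in a2] for d in delta3]
    return grad1, grad2


# Arbitrary small weights for a 2-2-1 network and one example.
theta1 = [[0.1, -0.2, 0.3], [0.05, 0.4, -0.1]]
theta2 = [[0.2, -0.3, 0.1]]
x, y = [1.0, 0.5], [1.0]
g1, g2 = backprop_single(theta1, theta2, x, y)
```

Each returned matrix has the same shape as the weight matrix it belongs to.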

2.5 🎢 Gradient Checking

Use numerical approximation to verify backpropagation:

$$\frac{\partial}{\partial \Theta} J(\Theta) \approx \frac{J(\Theta + \epsilon) - J(\Theta - \epsilon)}{2\epsilon}$$

⚠️ Once verified:

  • Disable gradient checking
  • It is computationally expensive
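
A small sketch of the check, assuming the parameters are unrolled into a flat list and perturbed one component at a time. A toy cost with a known gradient stands in for the network's $J(\Theta)$. The names `numerical_gradient` and `analytic_gradient` and the default `epsilon=1e-4` are assumptions for illustration.

```python
def numerical_gradient(J, theta, epsilon=1e-4):
    """Approximate dJ/dtheta_i for each i with a two-sided difference."""
    grad = []
    for i in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[i] += epsilon
        minus[i] -= epsilon
        grad.append((J(plus) - J(minus)) / (2 * epsilon))
    return grad


# Toy cost with a known gradient, standing in for the network's J(Theta).
def J(theta):
    return theta[0] ** 2 + 3 * theta[0] * theta[1]


def analytic_gradient(theta):
    return [2 * theta[0] + 3 * theta[1], 3 * theta[0]]


theta = [1.0, 2.0]
max_diff = max(abs(n - a) for n, a in
               zip(numerical_gradient(J, theta), analytic_gradient(theta)))
```

In a real check, `analytic_gradient` would be the unrolled output of backpropagation, and a small `max_diff` suggests the implementation is correct. The loop evaluates the cost twice per parameter, which is why the check is switched off before training.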

2.6 ⚖️ Minimize the Cost Function

Use:

  • Gradient descent, or
  • A built-in advanced optimization algorithm (e.g., conjugate gradient or L-BFGS)

to minimize $J(\Theta)$.
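
Plain gradient descent can be sketched as below. A toy convex cost stands in for $J(\Theta)$; for a neural network, `grad` would be the gradient produced by backpropagation. The learning rate `alpha`, the iteration count, and the function names are illustrative assumptions.

```python
def gradient_descent(grad, theta, alpha=0.1, num_iters=200):
    """Repeat theta := theta - alpha * grad(theta) for a fixed number of steps."""
    theta = list(theta)
    for _ in range(num_iters):
        g = grad(theta)
        theta = [t - alpha * g_i for t, g_i in zip(theta, g)]
    return theta


# Toy convex cost J(theta) = (theta_0 - 3)^2 + (theta_1 + 1)^2, minimized at (3, -1).
def grad(theta):
    return [2 * (theta[0] - 3), 2 * (theta[1] + 1)]


theta_min = gradient_descent(grad, [0.0, 0.0])
```

Unlike this toy cost, $J(\Theta)$ for a neural network is not convex, so gradient descent may settle in a local minimum.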


Training Loop

During training, we iterate over all examples:

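for i = 1:m
    % Forward propagation:
    % set a^(1) = x^(i), then compute activations a^(l) for l = 2,...,L

    % Backpropagation:
    % compute d^(L) = a^(L) - y^(i), then delta terms d^(l) for l = L-1,...,2

    % Accumulate gradients:
    % Delta^(l) := Delta^(l) + d^(l+1) * (a^(l))'
end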

For each example:

  • Perform forward pass
  • Compute errors
  • Accumulate gradients
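
The accumulation step can be sketched generically: a per-example gradient function (forward pass plus backpropagation) is called for each example, and the results are summed and divided by $m$. To keep the example short, a single sigmoid unit stands in for a full network, and regularization is left out. All names here are hypothetical.

```python
import math


def average_gradient(examples, per_example_grad, n_params):
    """Accumulate per-example gradients and return their mean over m examples."""
    accumulator = [0.0] * n_params      # plays the role of Delta
    for x, y in examples:
        g = per_example_grad(x, y)      # forward pass + backpropagation for one example
        accumulator = [acc + g_i for acc, g_i in zip(accumulator, g)]
    m = len(examples)
    return [acc / m for acc in accumulator]


theta = [0.0, 0.0]  # bias weight and input weight of a single sigmoid unit


def unit_grad(x, y):
    """Gradient of the logistic loss for one example: (h - y) * [1, x]."""
    h = 1.0 / (1.0 + math.exp(-(theta[0] + theta[1] * x)))
    return [(h - y) * 1.0, (h - y) * x]


examples = [(0.0, 0), (1.0, 1), (2.0, 1)]
D = average_gradient(examples, unit_grad, n_params=2)
```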

Final Insight

Neural network training is simply:

  • Forward propagation
  • Backpropagation
  • Gradient-based optimization

All of deep learning is built on this foundation.

Complete Neural Network Workflow

  1. Choose architecture
  2. Initialize weights randomly
  3. Implement forward propagation
  4. Implement cost function
  5. Implement backpropagation
  6. Perform gradient checking, then disable it
  7. Optimize using gradient descent or an advanced optimizer
  8. Train until convergence
Written by Hitesh Sahu, a passionate developer and blogger.

Fri Feb 27 2026
