Hitesh Sahu

Multiclass Classification with Neural Networks

Learn how to extend binary classification to multiclass classification using neural networks, where the output layer consists of multiple units representing different classes, and the final prediction is made by selecting the class with the highest output value.

Hitesh Sahu
Written by Hitesh Sahu, a passionate developer and blogger.

Fri Feb 27 2026

Extending Binary Classification

In binary classification, our hypothesis outputs a single value:

$$h_\Theta(x) \in [0,1]$$

For multiclass classification, the hypothesis instead returns a vector with one probability per class.


Example: Four-Class Classification

Suppose we want to classify an image into one of four categories:

  • Car
  • Pedestrian
  • Truck
  • Motorcycle

Instead of one output unit, we use four output units.


Network Structure

$$\text{Input Image} \rightarrow \text{Hidden Layers} \rightarrow \text{4 Output Units}$$

Each output unit corresponds to one class.
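
To make the structure concrete, here is a minimal forward-pass sketch in plain Python. It assumes sigmoid activations in every layer (as in logistic regression), and the layer sizes, weights, and the helper name `dense_sigmoid` are made up for illustration; they are not taken from a trained model. With sigmoid outputs, each of the four values lies in (0, 1) independently, so they need not sum to 1.

```python
import math

def sigmoid(z):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense_sigmoid(inputs, weights, biases):
    """One fully connected layer followed by a sigmoid activation.

    weights holds one row per unit in this layer.
    """
    return [
        sigmoid(sum(w * a for w, a in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

# Hypothetical tiny network: 3 input features -> 2 hidden units -> 4 output units.
x = [0.5, -1.2, 0.3]                       # illustrative input features
W1 = [[0.1, -0.4, 0.2], [0.7, 0.3, -0.5]]  # hidden layer weights (made up)
b1 = [0.0, 0.1]
W2 = [[0.6, -0.1], [-0.3, 0.8], [0.2, 0.2], [-0.5, 0.4]]  # one row per class
b2 = [0.0, 0.0, 0.0, 0.0]

hidden = dense_sigmoid(x, W1, b1)
h = dense_sigmoid(hidden, W2, b2)  # four values, one per class
print(h)
```

The only change from a binary classifier is the shape of the last weight matrix: four rows instead of one.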


Output Representation

Our hypothesis now returns:

$$h_\Theta(x) = \begin{bmatrix} h_\Theta(x)_1 \\ h_\Theta(x)_2 \\ h_\Theta(x)_3 \\ h_\Theta(x)_4 \end{bmatrix}$$

Where:

  • $h_\Theta(x)_1$ → Probability of Car
  • $h_\Theta(x)_2$ → Probability of Pedestrian
  • $h_\Theta(x)_3$ → Probability of Truck
  • $h_\Theta(x)_4$ → Probability of Motorcycle

Training Labels (One-Hot Encoding)

Each training example has a label vector:

$$y^{(i)} \in \mathbb{R}^4$$

Examples:

Car:

$$\begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$

Motorcycle:

$$\begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}$$

This is called one-hot encoding.
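
The encoding can be sketched in a few lines of Python. The `CLASSES` list and the `one_hot` helper are illustrative names, and the class order is assumed to match the output units above. Note that Python lists are 0-indexed, so Car sits at index 0 rather than 1.

```python
# Class order assumed to match the four output units.
CLASSES = ["Car", "Pedestrian", "Truck", "Motorcycle"]

def one_hot(label, classes=CLASSES):
    """Return a vector with a 1 at the label's position and 0 elsewhere."""
    vec = [0] * len(classes)
    vec[classes.index(label)] = 1  # raises ValueError for an unknown label
    return vec

print(one_hot("Car"))         # [1, 0, 0, 0]
print(one_hot("Motorcycle"))  # [0, 0, 0, 1]
```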


Example Output

Suppose the network outputs:

$$h_\Theta(x) = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}$$

This means:

$$h_\Theta(x)_3 = 1$$

So the predicted class is the third category.

If we defined:

1 → Car
2 → Pedestrian
3 → Truck
4 → Motorcycle

Then the model predicts:

$$\text{Truck}$$

Decision Rule

In practice, we select:

$$\text{Prediction} = \arg\max_k h_\Theta(x)_k$$

That is, we choose the class with the largest output value.
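
A minimal sketch of this rule in Python, using an illustrative `predict` helper and a made-up output vector. Real network outputs are rarely exactly 0 or 1, which is why the largest value is taken rather than looking for an exact 1.

```python
# Class order assumed to match the four output units.
CLASSES = ["Car", "Pedestrian", "Truck", "Motorcycle"]

def predict(outputs, classes=CLASSES):
    """Return the class whose output unit has the largest value."""
    # Index of the maximum; on ties, the first maximum wins.
    k = max(range(len(outputs)), key=lambda i: outputs[i])
    return classes[k]

# Hypothetical network output: the third unit is the largest.
print(predict([0.05, 0.10, 0.80, 0.05]))  # Truck
```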


Key Idea

  • Binary classification → 1 output unit
  • Multiclass classification → K output units
  • Output layer size = number of classes
  • Final prediction = index of largest output

Neural networks naturally extend logistic regression to multiple classes by simply increasing the number of output neurons.
