Machine Learning: Introduction and Core Algorithms
Beginner-friendly introduction to machine learning, covering key concepts, model types, supervised and unsupervised learning, and essential algorithms such as linear regression, logistic regression, decision trees, and clustering.
Machine Learning 🤖
AI
AI is the field of study of "intelligent agents": any device that perceives its environment and takes actions that maximize its chance of successfully achieving its goals.
ML
ML is the study of computer algorithms that improve automatically through experience.
- ML is a subset of AI
- Learning from data
- Improving performance (P) with experience (E) while performing a task (T)
Older definition -- Arthur Samuel (1959)
The field of study that gives computers the ability to learn without being explicitly programmed.
Modern definition -- Tom Mitchell (1998)
A program learns from:
- E (Experience): User-labeled emails
- T (Task): Classify emails as spam or not spam
- P (Performance measure): Fraction of correctly classified emails
If performance on task T, measured by P, improves with experience E, then it is learning.
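The spam-filter example can be made concrete: P is simply the fraction of predictions that match the user's labels. A minimal sketch (the labels and predictions below are made up for illustration):

```python
# Hypothetical labels for five emails: 1 = spam, 0 = not spam
true_labels = [1, 0, 1, 1, 0]   # E: user-labeled emails
predictions = [1, 0, 0, 1, 0]   # T: the filter's classifications

# P: fraction of correctly classified emails
accuracy = sum(t == p for t, p in zip(true_labels, predictions)) / len(true_labels)
print(accuracy)  # 0.8
```

As the filter sees more labeled emails (more E) and this fraction rises, the program is "learning" in Mitchell's sense.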
Use Cases
ML is powerful when:
- Handling problems too complex to hard-code
- Finding hidden patterns in large datasets
1. Large Datasets Exist
- Web Analytics data
- Medical records
- Biological data
2. Problems Are Hard to Hand-Code
- Autonomous driving
- Handwriting recognition
- NLP (Natural Language Processing)
- Computer vision
3. Self-Customizing Systems
- Amazon recommendations
- Netflix recommendations
Types of ML Algorithms
1. Supervised Learning
You give the algorithm input data and the correct outputs (“right answers”), and it learns to predict outputs for new inputs.
So every training example has:
- Input features (x)
- Correct output label (y)
Example
- Spam filtering with labeled emails
- Diabetes classification with labeled patients
- Cancer Type Prediction
Types
1.1 Regression
Regression means predicting a continuous value output.
Example: Housing Price Prediction (Regression)
Predict the price of a house based on its size.
- Feature (x): House size (square feet)
- Output (y): Price (continuous value)
We are given historical data:
| Size (sq ft) | Price ($) |
|---|---|
| 1000 | 200000 |
| 1500 | 300000 |
| 2000 | 400000 |
The algorithm may:
- Fit a straight line (Linear Regression)
- Fit a quadratic curve (Polynomial Regression)
Different models may produce different predictions.
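Fitting a straight line to the table above can be sketched in a few lines of numpy; the 1750 sq ft query house is a made-up example, not from the table:

```python
import numpy as np

# Historical data from the table above
size = np.array([1000, 1500, 2000])         # sq ft
price = np.array([200000, 300000, 400000])  # $

# Fit a straight line (linear regression) by least squares
slope, intercept = np.polyfit(size, price, 1)

# Predict the price of a hypothetical 1750 sq ft house
predicted = slope * 1750 + intercept
print(round(predicted))  # 350000
```

Because the three points here lie exactly on a line, the fit is perfect (slope 200 $/sq ft); real housing data would be noisy, and a quadratic fit could give a different prediction for the same house.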

1.2 Classification
Classification means predicting a discrete category as output.
- We train using past labeled examples.
- Only specific categories are allowed as output (e.g., 0 or 1)
Example: Breast Cancer Detection (Classification)
What is the probability this tumor is malignant?
- Malignant (1)
- Benign (0)
Using One Feature
- Feature: Tumor size
- Output: 0 or 1
Even if there are multiple categories:
- 0 → No cancer
- 1 → Type 1 cancer
- 2 → Type 2 cancer
- 3 → Type 3 cancer
It is still classification because the output is from a finite set of categories.
Multiple Features
In real problems, we use more than one feature:
- Tumor size
- Age
- Clump thickness
- Uniformity of cell size
- Uniformity of cell shape
The algorithm learns a decision boundary that separates categories.
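The one-feature tumor example can be sketched with logistic regression trained by gradient descent; the tumor sizes below are invented toy data, not clinical values:

```python
import numpy as np

# Hypothetical tumor sizes (cm) with labels: 0 = benign, 1 = malignant
x = np.array([1.0, 1.5, 2.0, 4.0, 4.5, 5.0])
y = np.array([0, 0, 0, 1, 1, 1])

# Logistic regression: gradient descent on the log loss
w, b = 0.0, 0.0
learning_rate = 0.1
for _ in range(5000):
    p = 1 / (1 + np.exp(-(w * x + b)))        # sigmoid: P(malignant | size)
    w -= learning_rate * ((p - y) * x).mean()  # gradient w.r.t. weight
    b -= learning_rate * (p - y).mean()        # gradient w.r.t. bias

def predict(size):
    """Return 1 (malignant) if the predicted probability is at least 0.5."""
    return int(1 / (1 + np.exp(-(w * size + b))) >= 0.5)

print(predict(1.2), predict(4.8))  # 0 1
```

The learned decision boundary is the size where the sigmoid crosses 0.5; with multiple features, the same idea yields a boundary line (or surface) separating the categories.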
2. Unsupervised Learning
In unsupervised learning, there are no labeled outputs. The system tries to find structure in the data.
- No labeled data
- Discovers hidden structure
- Common task: Clustering
- Advanced example: Cocktail Party Problem
In Unsupervised Learning, we are given a dataset with:
- No labels
- No correct answers
- No predefined categories
"Here is the data. Can you find structure in it?"
2.1 Clustering
The algorithm automatically groups similar data points together.
- Used to find patterns
We are not told:
- How many groups exist
- What the groups represent
- Which example belongs to which group
The algorithm discovers that on its own.
Example
- Given market data, identify patterns in buying behavior
- Given news articles, group them by topic
- Given data-center logs, find machines that frequently work together
- Given social-network data, find groups or communities
- Given customer data, find market segments
- Given astronomical data, find galaxies
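The market-segmentation example can be sketched with a minimal k-means loop; the two customer groups below are synthetic, and the features (monthly visits, monthly spend) are made up:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two made-up customer groups (features: monthly visits, monthly spend)
group_a = rng.normal(loc=[2.0, 2.0], scale=0.3, size=(50, 2))
group_b = rng.normal(loc=[8.0, 8.0], scale=0.3, size=(50, 2))
X = np.vstack([group_a, group_b])

# k-means with k = 2: alternate assignment and centroid updates
centers = np.array([X[0], X[-1]])  # initialize from two data points
for _ in range(20):
    dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)  # assign each point to its nearest center
    centers = np.array([X[labels == j].mean(axis=0) for j in range(2)])

# Each synthetic group should end up in its own cluster
print(len(set(labels[:50])), len(set(labels[50:])))  # 1 1
```

Note that we never told the algorithm which customer belongs to which group, or what the groups mean; it recovered the two segments from the data alone.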
2.2 Blind Source Separation
Separating mixed signals into original independent components.
The Cocktail Party Problem
Given only the mixed recordings:
- Detect that multiple sources exist
- Separate them into independent signals
- Recover the original voices
No labels are given:
- We do not tell the algorithm what each voice sounds like
- It discovers structure in the mixed signal on its own
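A toy numpy sketch of independent component analysis (ICA), the classic technique for this problem. The two "voices" here are synthetic waveforms (a sinusoid and a square wave) and the mixing matrix is made up; the iteration is a minimal symmetric FastICA, not a production implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
t = np.linspace(0, 8, n)

# Two synthetic "voices": a sinusoid and a square wave
S = np.c_[np.sin(2 * t), np.sign(np.sin(3 * t))]

# Two "microphones", each recording a different mixture (made-up mixing matrix)
A = np.array([[1.0, 0.5], [0.5, 1.0]])
X = S @ A.T

# Whiten the recordings: zero mean, identity covariance
Xc = X - X.mean(axis=0)
d, E = np.linalg.eigh(np.cov(Xc.T))
Z = Xc @ (E @ np.diag(d ** -0.5) @ E.T)

# Symmetric FastICA iteration with a tanh nonlinearity
W = rng.normal(size=(2, 2))
for _ in range(200):
    G = np.tanh(Z @ W.T)
    W_new = (G.T @ Z) / n - np.diag((1 - G ** 2).mean(axis=0)) @ W
    U, _, Vt = np.linalg.svd(W_new)  # symmetric decorrelation
    W = U @ Vt

S_est = Z @ W.T  # recovered sources (up to sign and ordering)
```

Each column of `S_est` should closely match one of the original sources; ICA can recover them only up to sign and order, which is why the unmixed components need no labels to be useful.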
Difference Between Supervised and Unsupervised Learning
| Aspect | Supervised Learning | Unsupervised Learning |
|---|---|---|
| Data | Labeled data (input + correct output) | Unlabeled data (input only) |
| Goal | Learn mapping from input → output | Discover hidden structure or patterns |
| Output Type | Continuous (regression) or discrete (classification) | Clusters, groups, latent structure |
| Example Problem | House price prediction | Customer segmentation |
| Example Problem | Spam detection | Grouping news articles |
| Human Guidance | Requires correct answers during training | No correct answers provided |
| Typical Tasks | Regression, Classification | Clustering, Dimensionality Reduction |
| Evaluation | Compare predictions with true labels | Evaluate structure quality (e.g., cohesion, separation) |
| Use Case | When you know what you want to predict | When you want to explore unknown patterns |
