Machine Learning: Introduction and Core Algorithms
Beginner-friendly introduction to machine learning, covering key concepts, model types, supervised and unsupervised learning, and essential algorithms such as linear regression, logistic regression, decision trees, and clustering.
Machine Learning 🤖
AI
AI is the field of study of "intelligent agents": any device that perceives its environment and takes actions that maximize its chance of successfully achieving its goals.
ML
ML is the study of computer algorithms that improve automatically through experience.
- ML is a subset of AI
- Learning from data
- Improving performance (P) with experience (E) while performing a task (T)
Older definition -- Arthur Samuel (1959)
The field of study that gives computers the ability to learn without being explicitly programmed.
Modern definition -- Tom Mitchell (1998)
A program learns from:
- E (Experience): user-labeled emails
- T (Task): classify emails as spam or not spam
- P (Performance measure): fraction of correctly classified emails
If performance on task T, measured by P, improves with experience E, then it is learning.
Use Cases
ML is powerful when:
1. Large Datasets Exist
- Web Analytics data
- Medical records
- Biological data
2. Problems Are Hard to Hand-Code
- Autonomous driving
- Handwriting recognition
- Natural Language Processing (NLP)
- Computer vision
3. Self-Customizing Systems
- Amazon recommendations
- Netflix recommendations
Machine Learning Methods
1. Supervised Learning
You give the algorithm input data and the correct outputs (“right answers”), and it learns to predict outputs for new inputs.
Training set:
- You are given labeled data:
$(x^{(i)}, y^{(i)})$
where:
- $x^{(i)}$ → input features
- $y^{(i)}$ → output label
Goal: Learn a function that maps inputs → outputs.
Example:
- Find a decision boundary separating positive and negative examples.
- Spam filtering with labeled emails
- Diabetes classification with labeled patients
- Cancer Type Prediction
1.1 Regression
Regression means predicting a continuous value output.
Algorithms studied for supervised learning:
- Linear Regression
- Logistic Regression (despite the name, used for classification)
- Neural Networks
- Support Vector Machines (SVM)
These methods learn a hypothesis function $h_\theta(x)$ used for prediction or classification.
Example
Housing Price Prediction (Regression)
Predict the price of a house based on its size.
- Feature (x): House size (square feet)
- Output (y): Price (continuous value)
We are given historical data:
| Size (sq ft) | Price ($) |
|---|---|
| 1000 | 200000 |
| 1500 | 300000 |
| 2000 | 400000 |
The algorithm may:
- Fit a straight line (Linear Regression)
- Fit a quadratic curve (Polynomial Regression)
Different models may produce different predictions.
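The two fits above can be sketched with NumPy; the training numbers come from the table, while the 1250 sq ft query point is an illustrative assumption:

```python
import numpy as np

# Training data from the table above
sizes  = np.array([1000.0, 1500.0, 2000.0])        # x: house size (sq ft)
prices = np.array([200000.0, 300000.0, 400000.0])  # y: price ($)

# Fit a straight line and a quadratic curve to the same data
linear_model = np.polyfit(sizes, prices, deg=1)  # Linear Regression
quad_model   = np.polyfit(sizes, prices, deg=2)  # Polynomial Regression

# Both models can then predict the price of an unseen house
price_linear = np.polyval(linear_model, 1250)
price_quad   = np.polyval(quad_model, 1250)
```

With this particular table the data is exactly linear, so here both models agree (≈ $250,000 for 1250 sq ft); on real data the two fits would generally diverge.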

1.2 Classification
Classification means predicting a
discrete categoryas output
- We train using past labeled examples.
- Only specific categories allowed as output (0 or 1)
Using One Feature
Breast Cancer Detection (Classification)
What is the probability this tumor is malignant?
- Feature: Tumor size
- Output: 0 or 1
- Malignant (1)
- Benign (0)
Multiple Features
More than one feature:
- Tumor size
- Age
- Clump thickness
- Uniformity of cell size
- Uniformity of cell shape
Multiple Output categories:
- 0 → No cancer
- 1 → Type 1 cancer
- 2 → Type 2 cancer
- 3 → Type 3 cancer
It is still classification because the output is from a finite set of categories.
The algorithm learns a decision boundary that separates categories.
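A minimal sketch of learning such a boundary with one feature, using logistic regression fit by gradient descent (the tumor sizes and labels below are made up for illustration):

```python
import numpy as np

# Hypothetical 1-D training set: tumor size (cm), label 1 = malignant
sizes  = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])
labels = np.array([0,   0,   0,   0,   1,   1,   1,   1])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Center the feature so gradient descent behaves well
x = sizes - sizes.mean()

# Model: P(malignant | size) = sigmoid(w*x + b), fit by batch gradient descent
w, b, lr = 0.0, 0.0, 0.1
for _ in range(5000):
    p = sigmoid(w * x + b)              # predicted probabilities
    w -= lr * np.mean((p - labels) * x)  # gradient of the log-loss in w
    b -= lr * np.mean(p - labels)        # gradient of the log-loss in b

# Decision boundary: the tumor size where P(malignant) = 0.5
boundary = sizes.mean() - b / w
```

On this toy data the learned boundary lands between the largest benign (2.0 cm) and smallest malignant (2.5 cm) tumors, classifying all training examples correctly.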
2. Unsupervised Learning
There is no labeled input. The system tries to find structure in the data.
Goal: discover hidden structure in data.
Algorithms studied:
- K-Means Clustering
- Principal Component Analysis (PCA) for dimensionality reduction
- Anomaly Detection
Training set:
Unsupervised learning uses unlabeled data:
$x^{(1)}, x^{(2)}, \ldots, x^{(m)}$
- No labels
- No correct answers
- No predefined categories
where:
- $x^{(i)}$ is the input (features)
- There are no $y$ labels.
Goal
Discover hidden structure in the data.
"Here is the data. Can you find structure in it?"
We do not tell the algorithm what the correct output is.
We ask it to find patterns on its own.
- Discovers hidden structure
- Common task: Clustering
- Advanced example: Cocktail Party Problem
2.1 Clustering
The algorithm automatically groups similar data points together.
- Used to find patterns
We are not told:
- How many groups exist
- What the groups represent
- Which example belongs to which group
The algorithm discovers that on its own.
Example
- Given market data, identify patterns in buying behavior
- Given news articles, find topics
- Given data-center logs, find machines that frequently work together
- Given social network data, find groups or communities
- Given customer data, find market segments
- Given astronomical data, find galaxies
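Clustering like this can be sketched with a minimal K-Means implementation (the two-group customer data below is synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-D customer data with two natural (but unlabeled) groups
X = np.vstack([
    rng.normal([0, 0], 0.5, size=(50, 2)),
    rng.normal([4, 4], 0.5, size=(50, 2)),
])

def kmeans(X, k, iters=50):
    # Initialize centers at k random data points
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its assigned points
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centers

labels, centers = kmeans(X, k=2)
```

We never told the algorithm there were two groups or which point belongs where; it recovers that structure from the geometry of the data alone.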
2.2 Blind Source Separation
Separating mixed signals into original independent components.
The Cocktail Party Problem
Given only the mixed recordings:
- Detect that multiple sources exist
- Separate them into independent signals
- Recover the original voices
No labels are given:
- We do not tell the algorithm what each voice sounds like
- It discovers structure in the signal
Goal: separate the original voices from the mixed signals.
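A minimal sketch of the idea using a hand-rolled FastICA-style iteration on two synthetic "voices" (the signals, mixing matrix, and iteration count are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 2000)
s1 = np.sin(2 * np.pi * 5 * t)            # "voice" 1: 5 Hz sine
s2 = np.sign(np.sin(2 * np.pi * 3 * t))   # "voice" 2: 3 Hz square wave
S = np.vstack([s1, s2])                   # true sources, shape (2, n)

A = np.array([[0.60, 0.40],               # unknown mixing matrix
              [0.45, 0.55]])
X = A @ S                                 # the two "microphone" recordings

# Center and whiten the mixtures (identity covariance)
X = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(np.cov(X))
Xw = E @ np.diag(d ** -0.5) @ E.T @ X

# FastICA-style fixed-point iteration (tanh nonlinearity, symmetric decorrelation)
W = rng.standard_normal((2, 2))
for _ in range(200):
    G = np.tanh(W @ Xw)
    W_new = (G @ Xw.T) / Xw.shape[1] - np.diag((1 - G**2).mean(axis=1)) @ W
    U, _, Vt = np.linalg.svd(W_new)
    W = U @ Vt                            # keep the unmixing rows orthonormal
S_hat = W @ Xw                            # recovered sources (up to sign/order)
```

Note that ICA can only recover the sources up to permutation and sign, since nothing in the mixtures distinguishes "voice 1" from "voice 2" or a signal from its negation.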
Difference Between Supervised and Unsupervised Learning
| Aspect | Supervised Learning | Unsupervised Learning |
|---|---|---|
| Data | Labeled data (input + correct output) | Unlabeled data (input only) |
| Goal | Learn mapping from input → output | Discover hidden structure or patterns |
| Output Type | Continuous (regression) or discrete (classification) | Clusters, groups, latent structure |
| Example Problem | House price prediction | Customer segmentation |
| Example Problem | Spam detection | Grouping news articles |
| Human Guidance | Requires correct answers during training | No correct answers provided |
| Typical Tasks | Regression, Classification | Clustering, Dimensionality Reduction |
| Evaluation | Compare predictions with true labels | Evaluate structure quality (e.g., cohesion, separation) |
| Use Case | When you know what you want to predict | When you want to explore unknown patterns |
3. Special Applications
Several practical ML systems were discussed:
- Recommender Systems
- Large-scale machine learning
- Parallel and distributed learning
- Computer vision using sliding window object detection
These show how ML is applied in real-world systems.
Building Machine Learning Systems
How to make ML systems work in practice.
Important concepts:
Bias vs Variance
- High bias → underfitting
- High variance → overfitting
Regularization helps control variance.
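One way to see this: ridge-style regularization adds a penalty $\lambda$ that shrinks the weights of an overly flexible model. A sketch on synthetic data (the degree-9 features and $\lambda = 10^{-3}$ are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 20)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, size=x.size)  # signal + noise

# Degree-9 polynomial features: flexible enough to overfit 20 noisy points
X = np.vander(x, 10, increasing=True)

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: w = (X^T X + lam*I)^(-1) X^T y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

w_unreg = ridge_fit(X, y, lam=0.0)   # high variance: weights blow up
w_ridge = ridge_fit(X, y, lam=1e-3)  # penalty shrinks the weights
```

The penalized fit has much smaller weights, which is exactly the variance-controlling effect: the model can no longer contort itself to chase every noisy point.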
Evaluating Learning Algorithms
Proper evaluation is essential.
Data is usually split into:
- Training set
- Cross-validation set
- Test set
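One common (though not mandatory) split is 60/20/20; a sketch on hypothetical indices:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100                    # hypothetical dataset size
idx = rng.permutation(n)   # shuffle before splitting

train_idx = idx[:60]       # 60% - fit model parameters
cv_idx    = idx[60:80]     # 20% - choose hyperparameters / compare models
test_idx  = idx[80:]       # 20% - final, untouched performance estimate
```

Keeping the test set untouched until the very end is what makes its error estimate honest; if it is used to pick models, it silently becomes a second cross-validation set.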
Common evaluation metrics:
- Precision
- Recall
- F1 Score
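These metrics follow directly from confusion-matrix counts; a small sketch (the counts in the usage note are made up):

```python
def precision_recall_f1(tp, fp, fn):
    """Compute metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)  # of predicted positives, how many were right
    recall    = tp / (tp + fn)  # of actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1
```

For example, with hypothetical counts tp=8, fp=2, fn=4 this gives precision 0.8 and recall ≈ 0.67; the F1 score balances the two into a single number.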
Debugging and Improving ML Systems
Tools discussed for diagnosing problems:
- Learning curves
- Error analysis
- Ceiling analysis
These techniques help answer:
What should we work on next to improve the system?
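A learning curve, for instance, plots training and cross-validation error against training-set size; a minimal sketch on synthetic linear data (all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 3 * x + 1 + rng.normal(0, 1, 200)   # linear signal + unit-variance noise

x_train, y_train = x[:150], y[:150]
x_cv, y_cv = x[150:], y[150:]

sizes = range(5, 151, 5)
train_err, cv_err = [], []
for m in sizes:
    w = np.polyfit(x_train[:m], y_train[:m], 1)  # fit on first m examples
    train_err.append(np.mean((np.polyval(w, x_train[:m]) - y_train[:m]) ** 2))
    cv_err.append(np.mean((np.polyval(w, x_cv) - y_cv) ** 2))
# Here both errors converge toward the noise level as m grows: more data
# will not help much, so the next improvement should come from the model.
```

A large persistent gap between the two curves would instead suggest high variance, where gathering more data (or regularizing) is the right next step.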
