Overview
Part 1 covers the foundations of machine learning using Scikit-Learn. It focuses on classical ML algorithms, data preprocessing, model evaluation, and building production-ready pipelines.
Chapters
Chapter 1: The Machine Learning Landscape
- What is Machine Learning?
- Types of ML Systems
- Main Challenges of ML
- Testing and Validating
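The "Testing and Validating" topic boils down to holding out data the model never sees during training. A minimal sketch, using a synthetic toy dataset (the data and split ratio here are illustrative assumptions, not from the book):

```python
# Hedged sketch: hold out a test set to validate a model.
# The dataset below is synthetic and purely illustrative.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 3))        # 100 samples, 3 features
y = (X[:, 0] > 0).astype(int)        # simple synthetic labels

# Reserve 20% of the data for final evaluation; never tune on it.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```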
Chapter 2: End-to-End Machine Learning Project
- Working with Real Data
- Data Exploration and Visualization
- Preparing Data for ML Algorithms
- Selecting and Training a Model
- Fine-Tuning Your Model
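The data-preparation steps above typically get chained into a single Scikit-Learn `Pipeline` so that imputation, scaling, and the model travel together. A sketch on synthetic data (the dataset and the choice of estimators are my own assumptions, not the book's housing example):

```python
# Sketch of a preprocessing + model pipeline in the Chapter 2 spirit.
# Synthetic data stands in for the book's housing dataset.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
X[::10, 0] = np.nan                                 # introduce missing values
y = 3 * np.nan_to_num(X[:, 0]) + rng.normal(scale=0.1, size=200)

pipeline = Pipeline([
    ("imputer", SimpleImputer(strategy="median")),  # fill missing values
    ("scaler", StandardScaler()),                   # standardize features
    ("model", LinearRegression()),                  # final estimator
])
pipeline.fit(X, y)
score = pipeline.score(X, y)                        # R^2 on training data
```

Wrapping preprocessing in the pipeline also prevents test-set leakage: the imputer and scaler are fit only on whatever data `fit` sees.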
Chapter 3: Classification
- MNIST Dataset
- Training a Binary Classifier
- Performance Measures
- Multiclass Classification
- Error Analysis
- Multilabel and Multioutput Classification
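A quick sketch of the binary-classifier setup from this chapter, using Scikit-Learn's small built-in digits dataset as a stand-in for MNIST (the "is it a 5?" framing follows the book; the cross-validation settings are my choice):

```python
# Binary classification sketch: "is this digit a 5?" on the small
# digits dataset (a stand-in for MNIST).
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_val_score

X, y = load_digits(return_X_y=True)
y_is_5 = (y == 5)                     # skewed binary target

clf = SGDClassifier(random_state=42)
# Plain accuracy can mislead on skewed classes; cross-validation at
# least gives a more honest estimate than a single split.
scores = cross_val_score(clf, X, y_is_5, cv=3, scoring="accuracy")
```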
Chapter 4: Training Models
- Linear Regression
- Gradient Descent
- Polynomial Regression
- Regularized Linear Models (Ridge, Lasso, Elastic Net)
- Logistic Regression
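Batch gradient descent for linear regression can be written in a few lines of NumPy, which makes the chapter's update rule concrete. A toy sketch (learning rate, iteration count, and the synthetic data are arbitrary choices of mine):

```python
# Toy batch gradient descent for linear regression (pure NumPy).
import numpy as np

rng = np.random.default_rng(42)
X = 2 * rng.random((100, 1))
y = 4 + 3 * X + rng.normal(scale=0.5, size=(100, 1))  # true params: 4, 3

X_b = np.c_[np.ones((100, 1)), X]    # add bias feature x0 = 1
theta = rng.normal(size=(2, 1))      # random initialization

eta, n_iterations = 0.1, 1000
for _ in range(n_iterations):
    # Gradient of the MSE cost with respect to theta.
    gradients = 2 / 100 * X_b.T @ (X_b @ theta - y)
    theta -= eta * gradients
# theta should now sit near the true parameters (4, 3).
```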
Chapter 5: Support Vector Machines
- Linear SVM Classification
- Nonlinear SVM Classification
- SVM Regression
- Under the Hood
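Nonlinear SVM classification in one small sketch: an RBF-kernel `SVC` on the two-moons toy dataset, scaled first because SVMs are sensitive to feature scales (the hyperparameter values are illustrative, not tuned):

```python
# Nonlinear SVM sketch: RBF-kernel SVC on the two-moons dataset.
from sklearn.datasets import make_moons
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.15, random_state=42)

# Scale features before the SVM; C and gamma below are untuned defaults.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X, y)
accuracy = clf.score(X, y)           # training accuracy
```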
Chapter 6: Decision Trees
- Training and Visualizing Decision Trees
- Making Predictions
- Estimating Class Probabilities
- CART Training Algorithm
- Regularization Hyperparameters
- Regression
- Instability
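A small sketch tying three of the bullets together: training a tree, capping `max_depth` as a regularization hyperparameter (CART is greedy and will otherwise grow until pure leaves), and reading class probabilities off a leaf:

```python
# Decision tree sketch: training, regularization via max_depth, and
# class-probability estimation, on the built-in iris dataset.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Limiting depth restricts how finely the tree can partition the
# feature space, which guards against overfitting.
tree = DecisionTreeClassifier(max_depth=2, random_state=42)
tree.fit(X, y)
proba = tree.predict_proba(X[:1])    # class probabilities for one sample
```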
Chapter 7: Ensemble Learning and Random Forests
- Voting Classifiers
- Bagging and Pasting
- Random Forests
- Boosting (AdaBoost, Gradient Boosting)
- Stacking
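The voting-classifier idea can be sketched in a few lines: combine several diverse models and take a majority vote (the choice of base models here is mine, not prescribed by the book):

```python
# Ensemble sketch: hard-voting classifier over three diverse models.
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.3, random_state=42)

voting_clf = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(random_state=42)),
        ("rf", RandomForestClassifier(random_state=42)),
        ("svc", SVC(random_state=42)),
    ],
    voting="hard",                   # majority vote across predictions
)
voting_clf.fit(X, y)
accuracy = voting_clf.score(X, y)    # training accuracy
```

The ensemble tends to beat its weakest members as long as the base models make reasonably independent errors.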
Chapter 8: Dimensionality Reduction
- The Curse of Dimensionality
- Main Approaches for Dimensionality Reduction
- PCA
- Kernel PCA
- LLE, MDS, Isomap, t-SNE
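PCA in miniature: project the iris data onto its top two principal components and check how much variance survives the projection (dataset choice is mine; the book's examples differ):

```python
# PCA sketch: project iris down to 2 components and inspect the
# preserved variance ratio.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)          # project onto top 2 components
explained = pca.explained_variance_ratio_.sum()
```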
Chapter 9: Unsupervised Learning Techniques
- Clustering (K-Means, DBSCAN, etc.)
- Gaussian Mixtures
- Anomaly Detection
- Novelty Detection
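A minimal clustering sketch: K-Means on synthetic blobs. Here k is assumed known; in practice it would be chosen via inertia (elbow) or silhouette analysis, as the chapter covers:

```python
# Clustering sketch: K-Means on synthetic blobs with a known k.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=42)

kmeans = KMeans(n_clusters=4, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)       # cluster assignment per sample
```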
Learning Goals
By the end of Part 1, I should be able to:
- Build end-to-end ML pipelines with Scikit-Learn
- Choose appropriate algorithms for different problem types
- Evaluate and fine-tune models effectively
- Handle common ML challenges (overfitting, underfitting, curse of dimensionality)
- Apply ensemble methods to improve predictions
- Perform dimensionality reduction and clustering
Practice Repository
All code implementations and experiments for Part 1 are tracked in:
This repo contains:
- Chapter-by-chapter implementations
- Extended experiments beyond book examples
- Custom datasets and challenges
- Notes on gotchas and best practices
Key Takeaways
(To be filled as I progress through the chapters)