Machine Learning

roadmap.sh: https://roadmap.sh/machine-learning

A suggested path through the Machine Learning nodes. Each node will link to its lesson once that lesson is written.

Nodes

Math foundations

  • Linear algebra
  • Vector operations
  • Matrix operations
  • Determinants & matrix inverse
  • Eigenvalues & diagonalization
  • Calculus
  • Derivatives & partial derivatives
  • Chain rule of differentiation
  • Gradient, Jacobian, Hessian
  • Discrete mathematics

Probability & statistics

  • Basics of probability
  • Bayes' theorem
  • Probability distributions
  • Descriptive statistics
  • Inferential statistics

Programming & tooling

  • Basic syntax
  • Data structures
  • Conditionals
  • Functions & built-in functions
  • Exceptions
  • Essential libraries
  • NumPy
  • Pandas
  • Scikit-learn
  • Version control
  • APIs
  • Databases (SQL / NoSQL)

Data collection & preparation

  • Data sources
  • Data formats (CSV, JSON, Excel)
  • Data loading
  • Data cleaning
  • Data preparation
  • Feature engineering
  • Feature selection
  • Feature scaling / normalization
  • Dimensionality reduction
  • Visualization (graphs & charts)

Core ML concepts

  • Basic concepts
  • Supervised learning
  • Unsupervised learning
  • Reinforcement learning
  • Overfitting
  • Underfitting
  • Regularization (Lasso, Ridge, ElasticNet)

Supervised algorithms

  • Linear regression
  • Logistic regression
  • Decision trees & random forest
  • K-nearest neighbors (KNN)
  • Support vector machines (SVM)
  • Gradient boosting machines
  • XGBoost

Unsupervised algorithms

  • Clustering
  • Hierarchical clustering
  • Dimensionality reduction (PCA)

Model evaluation & selection

  • Model evaluation
  • Model selection
  • Confusion matrix
  • Accuracy
  • Precision
  • Recall
  • F1-score
  • ROC-AUC
  • Log loss
  • Mean squared error
  • Root mean squared error
  • K-fold cross-validation
  • LOOCV
  • Train-test split
  • Validation strategies

Optimization

  • Optimization
  • Gradient descent
  • Stochastic gradient descent (SGD)

Neural networks & deep learning

  • Neural networks
  • Neural network architectures
  • Multilayer perceptron
  • Forward propagation
  • Backpropagation
  • Activation functions
  • Softmax
  • Vanishing gradient
  • Deep learning architectures
  • Deep learning libraries
  • TensorFlow
  • Keras
  • PyTorch

Computer vision

  • Convolutional neural network (CNN)
  • Convolution
  • Pooling
  • Applications of CNNs
  • Image classification
  • Image segmentation
  • Object detection
  • Image & video recognition

Sequence models

  • Recurrent neural network (RNN)
  • LSTM
  • GRU
  • Attention mechanisms
  • Attention models
  • Transformers
  • Embeddings
  • Multimodal learning

NLP

  • Natural language processing
  • Text processing
  • Preprocessing
  • Tokenization
  • Word tokenization
  • Stemming
  • Lemmatization
  • Word embeddings
  • Sentiment analysis
  • Qualitative analysis

Generative & advanced models

  • Autoencoders
  • Variational autoencoders
  • Generative adversarial networks (GANs)
  • Transfer learning
  • Recommendation systems
  • Pattern recognition

Reinforcement learning

  • Q-learning
  • Deep Q-networks
  • Actor-critic methods

Production & MLOps

  • Production
  • Quantization
  • Explainable AI
  • Uncertainty estimation

Resources

See resources.md.

Project ideas

  • House-price regression pipeline: clean a tabular dataset, engineer features, compare linear regression, random forest, and XGBoost with k-fold cross-validation, and report RMSE and R².
  • Image classifier with transfer learning: fine-tune a pretrained CNN (e.g. ResNet) on a small custom image dataset in PyTorch or Keras, tracking accuracy and a confusion matrix.
  • Sentiment-analysis NLP service: build a pipeline of text preprocessing, embedding, and classification, wrap it in an API, and serve predictions on movie or product reviews.
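
As a rough starting point for the first project idea, the evaluation loop (k-fold cross-validation reporting RMSE and R²) might look like the sketch below. It uses only the Python standard library so the mechanics stay visible. The data is synthetic, the single "area" feature and every function name (fit_line, rmse, r2, k_fold_scores) are hypothetical choices made for illustration, and only a one-feature linear regression is fitted. A real pipeline would more likely rely on scikit-learn's KFold or cross_val_score and compare several models.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(statistics.fmean([(t - p) ** 2 for t, p in zip(y_true, y_pred)]))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_scores(xs, ys, k=5, seed=0):
    """Return one (RMSE, R^2) pair per fold, each computed on held-out rows only."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint index groups
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [slope * xs[i] + intercept for i in held_out]
        truth = [ys[i] for i in held_out]
        scores.append((rmse(truth, preds), r2(truth, preds)))
    return scores


# Synthetic stand-in for a cleaned dataset (assumed): price depends on area plus noise.
rng = random.Random(42)
areas = [rng.uniform(50, 250) for _ in range(200)]
prices = [1500 * a + 20000 + rng.gauss(0, 15000) for a in areas]

scores = k_fold_scores(areas, prices, k=5)
mean_rmse = statistics.fmean([s[0] for s in scores])
mean_r2 = statistics.fmean([s[1] for s in scores])
print(f"mean RMSE: {mean_rmse:.0f}  mean R^2: {mean_r2:.3f}")
```

Averaging the per-fold scores gives a steadier estimate than a single train-test split. The same loop can be reused unchanged when swapping in another model, which keeps the comparison between models fair.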
