Self-Learning Resources for AI & Machine Learning (2026)

Whether you're preparing for graduate school, transitioning into ML engineering, or deepening your AI knowledge alongside a degree — these are the resources that actually work, organized by level and goal with honest time estimates.

How to Use This Guide

This guide is organized by topic area, not by a single rigid sequence. Most learners will move through Math → Core ML → Deep Learning → Specialization, but the right path depends on your starting point and goal. Use the Recommended Learning Sequences section near the end of this guide to map a path to your specific objective.

Every resource listed here has a time estimate — use these to plan realistically. "Complete this specialization" is not a plan; "complete 2 modules per week for 8 weeks" is. Self-learning works best with a weekly schedule and a specific project outcome as your milestone.
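To turn a time estimate into a weekly plan, the arithmetic is just total hours divided by the hours you can reliably commit each week, rounded up. The sketch below assumes a steady pace; the function name `weeks_needed` is illustrative and not taken from any tool.

```python
import math


def weeks_needed(total_hours: float, hours_per_week: float) -> int:
    """Whole weeks needed to cover total_hours at a steady weekly pace."""
    if hours_per_week <= 0:
        raise ValueError("hours_per_week must be positive")
    return math.ceil(total_hours / hours_per_week)


# Low and high ends of a 60-80 hour estimate at 10 hours per week.
low = weeks_needed(60, 10)
high = weeks_needed(80, 10)
print(f"Plan for {low} to {high} weeks")  # Plan for 6 to 8 weeks
```

Running both ends of a range gives a best case and a worst case, which is more useful for scheduling than a single midpoint.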

For resources linked to graduate program preparation specifically, see the AI Graduate Admissions Guide and the Best Master's in AI programs list.

1. Foundational Math

You don't need to master all of this before starting ML — but you will hit a ceiling without it. Linear algebra and probability are the two highest-leverage investments. Don't skip them to get to the "interesting stuff" faster: every neural network, every optimization algorithm, and every ML paper assumes this foundation.

Linear Algebra (18.06) · Top Pick

MIT OpenCourseWare

Intermediate · Free · 40–60 hrs

Gilbert Strang's legendary course. The best linear algebra course available anywhere, free. Essential for understanding PCA, SVD, neural network weight updates, and virtually every ML algorithm.

Essence of Linear Algebra

3Blue1Brown (YouTube)

Beginner · Free · 4–6 hrs

16-video series that builds geometric intuition for linear algebra concepts. Watch before or alongside Strang's course. Excellent for understanding what's actually happening visually.

Statistics 110: Probability · Top Pick

Harvard (YouTube)

Intermediate · Free · 35–50 hrs

Joe Blitzstein's probability course. Covers distributions, conditional probability, Bayes, and expectation — all directly applicable to ML. More rigorous and engaging than most alternatives.
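As a taste of the kind of reasoning the course trains, here is a minimal sketch of Bayes' theorem for a binary hypothesis. The spam-filter numbers are invented for illustration and are not from the course.

```python
def posterior(prior: float, likelihood: float, false_positive_rate: float) -> float:
    """P(H | E) from Bayes' theorem for a binary hypothesis H and evidence E."""
    # Total probability of the evidence:
    # P(E) = P(E | H) P(H) + P(E | not H) P(not H)
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Made-up example: 20% of mail is spam, a filter flags 90% of spam
# and 5% of legitimate mail. How likely is a flagged message to be spam?
p = posterior(prior=0.2, likelihood=0.9, false_positive_rate=0.05)
print(round(p, 3))  # 0.818
```

The same prior/likelihood/evidence structure reappears in naive Bayes classifiers and in Bayesian treatments of model parameters.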

Multivariable Calculus

Khan Academy

Beginner · Free · 20–30 hrs

Covers gradients, partial derivatives, and the chain rule — what you need for understanding backpropagation. Khan Academy's pacing is slower than OCW but excellent for building comfort with the mechanics.
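To make the link to backpropagation concrete, here is a small sketch of the chain rule applied to a one-weight "network" with a sigmoid activation and squared error, checked against a finite-difference estimate. The model and numbers are illustrative assumptions, not material from the Khan Academy course.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def loss(w: float, x: float, y: float) -> float:
    """Squared error of a single sigmoid unit with weight w."""
    return (sigmoid(w * x) - y) ** 2


def loss_grad(w: float, x: float, y: float) -> float:
    """dL/dw via the chain rule: dL/da * da/dz * dz/dw."""
    z = w * x
    a = sigmoid(z)
    dL_da = 2.0 * (a - y)   # derivative of (a - y)^2 with respect to a
    da_dz = a * (1.0 - a)   # derivative of the sigmoid
    dz_dw = x               # derivative of w * x with respect to w
    return dL_da * da_dz * dz_dw


def numeric_grad(w: float, x: float, y: float, eps: float = 1e-6) -> float:
    """Central-difference approximation, used only as a sanity check."""
    return (loss(w + eps, x, y) - loss(w - eps, x, y)) / (2.0 * eps)


analytic = loss_grad(0.5, 2.0, 1.0)
approx = numeric_grad(0.5, 2.0, 1.0)
print(abs(analytic - approx) < 1e-6)  # True
```

Backpropagation in a deep network is this same product of local derivatives, applied layer by layer and reused across weights.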

Mathematics for Machine Learning

Coursera / Imperial College London

Beginner · Free to audit · 18–24 hrs

Integrated course covering linear algebra, calculus, and statistics through the lens of ML applications. Good for people who want the math tied directly to what they'll use it for.

2. Core Machine Learning

This layer covers the algorithms, theory, and implementation skills that form the foundation of all modern ML. You need to understand supervised and unsupervised learning, model selection, regularization, and the math behind gradient descent before moving to deep learning. The gap between "I ran some ML code" and "I understand what's happening" lives here.
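As a rough illustration of what "the math behind gradient descent" and "regularization" mean in practice, the sketch below fits a one-variable linear model by gradient descent on mean squared error plus an L2 penalty. The data, learning rate, and step count are arbitrary choices made for the example.

```python
def fit_ridge(xs, ys, lam=0.0, lr=0.05, steps=2000):
    """Fit y = w*x + b by gradient descent on MSE + lam * w**2."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error, plus the L2 penalty on w only.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs)) + 2.0 * lam * w
        grad_b = 2.0 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (no noise).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w_plain, b_plain = fit_ridge(xs, ys)        # recovers roughly w=2, b=1
w_reg, b_reg = fit_ridge(xs, ys, lam=1.0)   # penalty shrinks w toward zero
print(w_reg < w_plain)  # True
```

With `lam=0` the fit recovers the generating line; with `lam=1.0` the slope shrinks toward zero, which is the bias that regularization trades for lower variance on noisy data.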

Machine Learning Specialization · Top Pick

Coursera (Andrew Ng / DeepLearning.AI)

Beginner · Free to audit · 60–80 hrs (3 courses)

The best structured introduction to machine learning. Covers supervised learning, neural networks, decision trees, and recommenders with Python exercises. Ideal as a first ML course — intuitive explanations with enough math to understand what's happening.

CS229: Machine Learning · Top Pick

Stanford (YouTube + lecture notes)

Intermediate · Free · 100+ hrs

Stanford's flagship ML graduate course. Lecture notes are a gold standard reference — more rigorous and mathematical than Ng's Coursera version. Read the lecture notes even if you don't watch all the videos. This is the level of theory that MS programs assume.

Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow

Book (Aurélien Géron, O'Reilly)

Intermediate · $45–$60 · 60–80 hrs

The best ML implementation book. Covers the full pipeline from data preprocessing to model deployment with clear code examples. Third edition covers transformers and diffusion models. Required reading for anyone building ML systems.

Elements of Statistical Learning

Hastie, Tibshirani & Friedman (free PDF)

Advanced · Free (PDF) · 100+ hrs (reference)

The graduate-level statistical learning theory reference. Not a course — a textbook. Read selectively: chapters on trees, SVMs, and regularization are especially useful. MIT and Stanford use this as a core reference.

3. Deep Learning

Deep learning is where modern AI capabilities come from — convolutional networks for vision, transformers for language, diffusion models for generation. This section covers the theory and implementation skills for deep learning across modalities. You should have core ML solid before going deep here.

Deep Learning Specialization · Top Pick

Coursera (Andrew Ng / DeepLearning.AI)

Intermediate · Free to audit · 80–100 hrs (5 courses)

Covers CNNs, RNNs, transformers, and sequence models with implementation in Python. Bridges the gap between core ML and modern deep learning. The transformer material in the final Sequence Models course is particularly useful given current industry demand.

Practical Deep Learning for Coders

fast.ai (completely free)

Intermediate · Free · 60–80 hrs

Jeremy Howard's bottom-up course builds intuition through doing. You train competitive image classifiers and NLP models in the first session. Better for building projects quickly than for theory depth. Pairs well with Ng's courses.

Deep Learning (Goodfellow, Bengio & Courville)

Free online (deeplearningbook.org)

Advanced · Free (online) / ~$80 (print) · Reference

The standard graduate-level deep learning textbook. Covers everything from optimization theory to generative models. Read chapter by chapter in parallel with a course, not as a standalone resource.

CS231N: Convolutional Neural Networks for Visual Recognition

Stanford (YouTube + course notes)

Advanced · Free · 50–70 hrs

The best computer vision course available. Covers CNNs, object detection, segmentation, and generative models for images. Course notes are reference-quality. Essential for computer vision or ML engineering roles.

4. Large Language Models & Generative AI

LLMs are the highest-demand specialization in the current AI job market. Understanding transformers at the implementation level — not just API-calling level — is what separates candidates who can build AI systems from those who can only use them. These resources go deep on how LLMs actually work.
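For a sense of what "implementation level" means, here is a minimal sketch of scaled dot-product attention, the core operation inside a transformer layer, in plain Python. It deliberately omits the learned projections, multiple heads, and causal masking that a real model uses, and the input numbers are arbitrary.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs


# Two identical keys get equal weight, so the output is the mean of the values.
print(attention([[1.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]], [[0.0, 2.0], [4.0, 6.0]]))
# [[2.0, 4.0]]
```

The resources in this section build from here to full models: adding learned query/key/value projections, masking future tokens, and stacking layers.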

Neural Networks: Zero to Hero · Top Pick

Andrej Karpathy (YouTube)

Intermediate · Free · 25–35 hrs

Karpathy (formerly at OpenAI and director of AI at Tesla) builds neural networks, transformers, and a GPT from scratch in plain Python and PyTorch. The clearest explanation of how modern LLMs actually work at the implementation level. Widely considered the best hands-on LLM curriculum available.

CS224N: NLP with Deep Learning

Stanford (YouTube)

Advanced · Free · 60–80 hrs

Stanford's NLP course covering word vectors, RNNs, transformers, and large language models. Lecture videos are free on YouTube. Strong theoretical foundation for NLP engineering roles.

Hugging Face NLP Course

Hugging Face (free)

Intermediate · Free · 20–30 hrs

Practical introduction to the Hugging Face ecosystem — the standard tooling for most production NLP work. Covers fine-tuning, tokenization, and model deployment. Directly applicable to real NLP engineering tasks.

LLM Engineering: Master AI & LLMs

Udemy (various instructors)

Intermediate · $15–$25 (on sale) · 20–40 hrs

Covers RAG pipelines, agent frameworks, and production LLM deployment. More applied and current than academic courses. Quality varies by instructor — read reviews carefully before purchasing.

5. Building Your Portfolio

Self-study without a portfolio is invisible. The deliverable that matters — for graduate school applications, for job applications, and for interviews — is demonstrated work you can point to.

What makes a strong ML portfolio project

  • It solves a specific problem with a clear dataset and measurable outcome — not "I explored machine learning on some data."
  • It shows the full pipeline: data acquisition/cleaning → exploratory analysis → modeling → evaluation → iteration. Projects that start at "I loaded a clean dataset from Kaggle" are less impressive than projects showing you had to do the messy work.
  • It's documented well, with a README that explains what you did, why, what you tried, what didn't work, and what the results were. Admissions committees and interviewers read READMEs.
  • It's reproducible: clean code, a requirements.txt or environment file, and instructions to run it. Code that only works on your machine doesn't demonstrate engineering competency.
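One piece of reproducibility that a dependency file does not cover is randomness. A minimal sketch, assuming a simple list-based dataset: give every random step its own seeded generator so that a reviewer running your code gets the same train/test split you did. The helper name `train_test_split` here is a hand-written illustration, not the scikit-learn function of the same name.

```python
import random


def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle with a private seeded RNG so the split is identical on every run."""
    rng = random.Random(seed)      # does not touch the global random state
    shuffled = list(items)         # copy, so the caller's data is not reordered
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


train_a, test_a = train_test_split(range(10), seed=42)
train_b, test_b = train_test_split(range(10), seed=42)
print(train_a == train_b and test_a == test_b)  # True
```

Recording the seed in the README alongside the run instructions lets someone else reproduce the reported numbers rather than merely similar ones.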

The highest-value portfolio activities

Reproduce a paper

Take a published ML paper and implement it from scratch in PyTorch without using the authors' code. This is the single most impressive thing you can put in a portfolio — it demonstrates both theoretical understanding and implementation skill. Start with simpler papers (pre-2018 CNNs, basic transformers).

Kaggle competitions

Structured practice on real datasets with measurable outcomes. A top-10% finish in a Kaggle competition is a legitimate portfolio signal. More importantly, competitions force you to iterate systematically — reading what other competitors did in post-mortems is one of the best learning activities available.

Domain-specific fine-tuning

Take an open-source model (LLaMA, BERT, Whisper) and fine-tune it on a domain-specific dataset you've assembled yourself. The data collection and curation process is itself a portfolio signal. Healthcare, legal, and scientific text are high-value domains.

End-to-end deployed application

A model that's actually deployed — even as a simple FastAPI endpoint or Streamlit app — is more impressive than a notebook. Deployment shows you understand the difference between a model and a system. Include monitoring or logging to show you're thinking about production.
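To show the shape of "a model plus logging behind an endpoint", here is a minimal sketch. It uses Python's standard-library `http.server` instead of FastAPI so that it runs with no dependencies; a FastAPI or Streamlit version would follow the same structure. The `predict` function is a hypothetical stand-in for a trained model, and the port and JSON field names are arbitrary choices.

```python
import json
import logging
from http.server import BaseHTTPRequestHandler, HTTPServer

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("serving")


def predict(features):
    """Hypothetical stand-in for a trained model: a fixed linear score."""
    weights = [0.5, -0.25]
    return sum(w * x for w, x in zip(weights, features))


def handle_predict(body: bytes):
    """Turn a JSON request body into (status_code, response_dict)."""
    try:
        features = json.loads(body)["features"]
        score = predict([float(x) for x in features])
    except (ValueError, KeyError, TypeError) as exc:
        log.warning("bad request: %s", exc)
        return 400, {"error": "expected JSON with a 'features' list of numbers"}
    # Log inputs and outputs so predictions can be audited later.
    log.info("features=%s score=%.4f", features, score)
    return 200, {"score": score}


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_predict(self.rfile.read(length))
        data = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port: int = 8000):
    """Start the server on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling `serve()` starts the endpoint locally. Keeping the request logic in `handle_predict`, separate from the HTTP wiring, means it can be unit-tested without starting a server, which is itself a signal of production thinking.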

Recommended Learning Sequences

Sequences for three common goals. Use these as starting templates, not rigid prescriptions.

Applying to AI/ML master's programs in 12 months

  1. Months 1–2: Linear algebra (MIT 18.06 or 3Blue1Brown) + Python basics if needed
  2. Months 2–4: Andrew Ng's Machine Learning Specialization (Coursera)
  3. Months 4–6: Deep Learning Specialization (Coursera) + start reading CS229 notes
  4. Months 6–9: Build 2–3 portfolio projects; reproduce one simple ML paper
  5. Months 9–12: Work on a Kaggle competition; document everything on GitHub; write Statement of Purpose

Transitioning into ML engineering from software engineering

  1. Weeks 1–4: Andrew Ng's Machine Learning Specialization to close theory gaps
  2. Month 2: fast.ai Practical Deep Learning — get hands-on with PyTorch quickly
  3. Month 3: Andrej Karpathy's Zero to Hero series to understand LLMs from scratch
  4. Months 4–5: Build an end-to-end ML project (data → training → serving) and document it
  5. Month 6+: Apply to ML roles or apply to MS programs with a strong portfolio

Building LLM / NLP expertise specifically

  1. Start: Andrew Ng's ML Specialization if you don't have an ML foundation
  2. Then: Karpathy's Neural Networks: Zero to Hero (non-negotiable for LLM understanding)
  3. Then: Hugging Face NLP Course (practical tooling)
  4. Then: Stanford CS224N lectures for theory depth
  5. Portfolio: Fine-tune an open-source model on a domain-specific dataset; deploy a simple RAG application

Frequently Asked Questions

Can I learn machine learning on my own without a degree?

Yes — and many successful ML engineers did exactly that before going back for a formal degree or alongside one. The foundational material for machine learning is freely available: Stanford's CS229 lecture notes, MIT's OpenCourseWare, fast.ai, and Andrej Karpathy's YouTube series are all free and genuinely rigorous. Self-taught ML engineers are employable at many companies, particularly at the mid-level with a strong GitHub portfolio. However, top-tier companies (Google, Meta, OpenAI) still filter heavily by academic credential in initial screening — a self-learning path is most powerful when combined with a master's degree, not as a replacement for one.

What math do I need to know before learning machine learning?

The core mathematical prerequisites for ML are: linear algebra (matrix operations, eigenvalues, SVD), multivariable calculus (gradients, partial derivatives, chain rule), probability and statistics (Bayes' theorem, distributions, expectation, variance), and basic optimization (gradient descent, convexity). You don't need to master all of these before starting — Andrew Ng's Machine Learning Specialization on Coursera introduces concepts gradually. However, to truly understand what's happening in neural networks and to read ML papers, you need these foundations solid. Gilbert Strang's Linear Algebra (MIT OCW) and Harvard's Statistics 110 (YouTube) are the highest-leverage starting points.

What is the best free course to learn machine learning?

The best free ML courses depend on your learning style. For structured, beginner-friendly learning: Andrew Ng's Machine Learning Specialization on Coursera (auditable for free, roughly 6–8 weeks at 10 hrs/week, based on the 60–80 hour estimate above). For deep theoretical grounding: Stanford's CS229 lecture notes and videos (free on YouTube and the Stanford website). For hands-on deep learning: fast.ai's Practical Deep Learning for Coders (completely free, bottom-up approach). For LLMs and transformers specifically: Andrej Karpathy's Neural Networks: Zero to Hero YouTube series (free, builds GPT from scratch). For NLP: Stanford's CS224N lectures (free on YouTube).

How long does it take to learn enough ML to apply to graduate school?

Realistically, building a competitive graduate school application profile through self-study takes 12–18 months of consistent effort (10–15 hours per week) if you're starting with basic programming skills. The sequence: months 1–3 on foundational math (linear algebra, probability), months 3–6 on core ML (Andrew Ng's courses, CS229 notes), months 6–10 on deep learning and a specialization area, months 10–18 on building 3–5 portfolio projects and potentially reproducing a paper. The portfolio and project quality matter more to admissions committees than the specific courses — documented, working code on GitHub with clear documentation is the deliverable that moves your application.

Is fast.ai or Andrew Ng's course better for learning ML?

They suit different learning styles and goals. Andrew Ng's Machine Learning Specialization (Coursera) is top-down with clear explanations, strong math foundations, and structured exercises — ideal for people who want to understand why things work. fast.ai's Practical Deep Learning for Coders is bottom-up: you train state-of-the-art models in the first week and work backward to theory — ideal for people who learn by doing and can tolerate ambiguity. For graduate school preparation, Andrew Ng's course builds the theoretical understanding that admissions committees and ML courses expect. For building portfolio projects quickly, fast.ai is faster. Most strong candidates use both.