# Math for AI/ML
This folder covers the core mathematics needed to understand how AI and LLMs work — not just how to use them, but what is actually happening under the hood.
## Why Math First?
You can use AI tools without math. But to understand:
- Why a model learns what it learns
- What backpropagation is actually doing
- How attention in a Transformer works mathematically
…you need this foundation. Every concept here maps directly to something in LLM training.
## Topics
| Topic | Folder | Status | Date |
|---|---|---|---|
| Linear Algebra | linear-algebra/ | ✅ Completed | 21.03.2026 |
| Calculus | calculus/ | ⏳ Not Started | |
| Probability & Statistics | probability-stats/ | ⏳ Not Started | |
## Linear Algebra

**Resource:** 3Blue1Brown — Essence of Linear Algebra

**Why this resource?** It builds geometric intuition first: you see what vectors and matrices are doing before you touch the formulas. Arguably the best introduction to linear algebra on the internet.

### Topics
- Topic 1 — Vectors, what even are they?
- Topic 2 — Linear combinations, span, and basis vectors
- Topic 3 — Linear transformations and matrices
- Topic 4 — Matrix multiplication as composition
- Topic 5 — The determinant
- Topic 6 — Inverse matrices, column space, and null space
- Topic 7 — Dot products and duality
- Topic 8 — Cross products
- Topic 9 — Change of basis
- Topic 10 — Eigenvectors and eigenvalues
### Direct ML Relevance
| Concept | Where it shows up in AI |
|---|---|
| Matrix multiplication | Forward pass of every neural network |
| Dot product | Attention score computation in Transformers |
| Eigenvectors | PCA, understanding weight matrices |
| Linear transformations | What every layer in a neural net is doing |
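The table above can be made concrete in a few lines of NumPy. This is a toy sketch, not a real network: the shapes, the random weights, and the reuse of a single weight matrix for both queries and keys are all illustrative choices, not anything from a real model.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.standard_normal((4, 8))   # 4 tokens, each an 8-dim embedding
W = rng.standard_normal((8, 8))   # weight matrix of one linear layer

# Forward pass of a linear layer = one matrix multiplication.
h = x @ W
print(h.shape)                    # (4, 8)

# Attention scores = dot products between query and key vectors,
# scaled by sqrt(d) as in scaled dot-product attention.
Q, K = x @ W, x @ W               # reusing W purely for brevity
scores = Q @ K.T / np.sqrt(K.shape[1])
print(scores.shape)               # (4, 4) — one score per token pair
```

Note how every row of the table reduces to `@`: a layer transforms token vectors, and attention compares them pairwise.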
## Calculus

**Resource:** 3Blue1Brown — Essence of Calculus (up next)

### Direct ML Relevance
| Concept | Where it shows up in AI |
|---|---|
| Derivatives | Gradient computation |
| Chain rule | Backpropagation |
| Partial derivatives | Loss w.r.t. each weight in a network |
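All three rows of this table show up even in a one-weight "network". The sketch below uses a made-up loss, `(w*x - y)**2`, to show the chain rule producing the same gradient a numerical check gives — the values are arbitrary toy numbers.

```python
# A one-weight model: prediction = w * x, squared-error loss.
x, y, w = 2.0, 1.0, 0.75

def loss(w):
    return (w * x - y) ** 2

# Chain rule by hand: dL/dw = 2 * (w*x - y) * x
grad_analytic = 2 * (w * x - y) * x

# Numerical check via central difference.
eps = 1e-6
grad_numeric = (loss(w + eps) - loss(w - eps)) / (2 * eps)

print(grad_analytic, grad_numeric)   # both ≈ 2.0
```

Backpropagation is this same chain-rule computation, applied layer by layer to every weight at once.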
## Probability & Statistics

**Resource:** Harvard Stat 110 — Blitzstein (up next)

### Direct ML Relevance
| Concept | Where it shows up in AI |
|---|---|
| Cross-entropy | The loss function used to train LLMs |
| KL Divergence | Used in RLHF and VAEs |
| Bayes' theorem | Foundation of probabilistic models |
| Distributions | Sampling, temperature, model outputs |
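Two rows of this table fit in a short NumPy sketch: cross-entropy as the negative log-probability of the correct next token, and temperature as a simple rescaling of logits before softmax. The logits and target index are made-up toy values.

```python
import numpy as np

logits = np.array([2.0, 1.0, 0.1])  # toy scores over a 3-token vocabulary
target = 0                          # index of the "correct" next token

def softmax(z):
    z = z - z.max()                 # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Cross-entropy loss: negative log-probability assigned to the target.
probs = softmax(logits)
loss = -np.log(probs[target])
print(round(loss, 3))

# Temperature: divide logits before softmax. T < 1 sharpens the
# distribution; T > 1 flattens it toward uniform.
for T in (0.5, 1.0, 2.0):
    print(T, softmax(logits / T).round(3))
```

Sampling a token is then just drawing from the tempered distribution, e.g. `np.random.default_rng().choice(3, p=softmax(logits / T))`.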
## Approach

1. Watch the video for intuition
2. Write notes in your own words in the topic folder
3. Implement the concept in NumPy
4. Commit — each concept gets its own commit

> "If you can't explain it simply, you don't understand it well enough." — Feynman