About

LLM Dojo

What

An open-source curriculum for learning large language model fine-tuning and inference optimization from scratch: 83 Jupyter notebooks organized into 7 stages. Every notebook runs on the Google Colab free tier (a T4 GPU with 15 GB of VRAM).

Coverage

Stage 00: Foundations — transformers, tokenization, datasets
Stage 01: Full fine-tuning — classification, generation, custom loss
Stage 02: PEFT — LoRA, QLoRA, adapters, prompt tuning, DoRA
Stage 03: Optimization — FlashAttention, DeepSpeed, FSDP
Stage 04: Alignment — RLHF, DPO, Constitutional AI, MoE
Stage 05: Production — CUDA, Triton, quantization, vLLM
Inference: KV cache, speculative decoding, GPTQ, AWQ, GGUF
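To give a flavor of the material, here is a minimal sketch of the LoRA idea covered in Stage 02, written in plain NumPy. All dimensions and names here are illustrative, not taken from any notebook: instead of updating a full weight matrix W, you train two small low-rank matrices A and B and add their product to the frozen weight.

```python
import numpy as np

# Illustrative dimensions (not from the curriculum): a square 768-dim layer
# adapted with rank r = 8.
d_in, d_out, r = 768, 768, 8
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable, small random init
B = np.zeros((d_out, r))                    # trainable, zero init

def lora_forward(x, alpha=16):
    # y = W x + (alpha / r) * B A x  -- only A and B receive gradients
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# Because B starts at zero, the adapted layer initially matches the frozen one.
assert np.allclose(lora_forward(x), W @ x)

# Trainable parameters drop from d_out * d_in to r * (d_in + d_out).
full_params = d_out * d_in          # 589824
lora_params = r * (d_in + d_out)    # 12288
```

The zero-initialized B is the standard trick that makes fine-tuning start exactly from the pretrained model's behavior.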

Prerequisites

Basic Python and PyTorch. You should know what a tensor is and how autograd works. No prior LLM experience required.
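As a rough gauge of the assumed level, you should be comfortable reading a snippet like the following (a generic PyTorch example, not taken from the notebooks):

```python
import torch

# The assumed baseline: build a tensor, run autograd, read back a gradient.
x = torch.tensor([2.0], requires_grad=True)
y = (x ** 2 + 3 * x).sum()   # y = x^2 + 3x
y.backward()                  # autograd computes dy/dx = 2x + 3
print(x.grad)                 # tensor([7.]) at x = 2
```

If that makes sense, Stage 00 fills in the rest.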

License

MIT. Fork it, extend it, use it commercially.