About

LLM Dojo

What

An open-source curriculum for learning large language model fine-tuning and inference optimization from scratch: 83 Jupyter notebooks organized into 7 stages. Every notebook runs on the Google Colab free tier (a T4 GPU with 15 GB of VRAM).

Coverage

Stage 00: Foundations — transformers, tokenization, datasets
Stage 01: Full fine-tuning — classification, generation, custom loss
Stage 02: PEFT — LoRA, QLoRA, adapters, prompt tuning, DoRA
Stage 03: Optimization — FlashAttention, DeepSpeed, FSDP
Stage 04: Alignment — RLHF, DPO, Constitutional AI, MoE
Stage 05: Production — CUDA, Triton, quantization, vLLM
Inference: KV cache, speculative decoding, GPTQ, AWQ, GGUF
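To give a flavor of the material, here is a minimal sketch of the LoRA idea covered in Stage 02, written in plain NumPy. All dimensions and names here are illustrative, not taken from any notebook: instead of updating a full weight matrix W, you train two small low-rank matrices A and B and add their product to the frozen weight.

```python
import numpy as np

# Illustrative dimensions (not from the curriculum): a square 768-dim layer
# adapted with rank r = 8.
d_in, d_out, r = 768, 768, 8
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable, small random init
B = np.zeros((d_out, r))                    # trainable, zero init

def lora_forward(x, alpha=16):
    # y = W x + (alpha / r) * B A x  -- only A and B receive gradients
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# Because B starts at zero, the adapted layer initially matches the frozen one.
assert np.allclose(lora_forward(x), W @ x)

# Trainable parameters drop from d_out * d_in to r * (d_in + d_out).
full_params = d_out * d_in          # 589824
lora_params = r * (d_in + d_out)    # 12288
```

The zero-initialized B is the standard trick that makes fine-tuning start exactly from the pretrained model's behavior.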

Prerequisites

Basic Python and PyTorch. You should know what a tensor is and how autograd works. No prior LLM experience required.
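As a rough gauge of the assumed level, you should be comfortable reading a snippet like the following (a generic PyTorch example, not taken from the notebooks):

```python
import torch

# The assumed baseline: build a tensor, run autograd, read back a gradient.
x = torch.tensor([2.0], requires_grad=True)
y = (x ** 2 + 3 * x).sum()   # y = x^2 + 3x
y.backward()                  # autograd computes dy/dx = 2x + 3
print(x.grad)                 # tensor([7.]) at x = 2
```

If that makes sense, Stage 00 fills in the rest.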

License

MIT. Fork it, extend it, use it commercially.