Stop Renting Intelligence
A free technical book on when local LLMs beat cloud, which model to pick, and how to ship one without embarrassing yourself.
- 106 tok/s on a £140 GPU
- 6 chapters + 2 appendices
- 0 fake case studies
An honest note. We have shipped zero paying fine-tunes. Our first fine-tuning attempt scored 33% on eval vs 34% for the base model — net negative. We published the full post-mortem in Appendix A because the failure taught us more than the book's successes. Every benchmark in this book was measured on hardware we paid for. Everything we have not measured is labelled as such.
Benchmarks we actually ran
| Model | Hardware | Speed | Quality |
|---|---|---|---|
| Qwen-2.5-7B | RTX 2070 (£140 used) | 106 tok/s | 8/10 |
| gemma3:4b | RTX 2070 | 83 tok/s | 7/10 |
| Qwen3-8B | MacBook Air M5 | 27 tok/s | Good |
Three data points. That is all we have measured ourselves. The book labels everything else as community benchmarks or estimates.
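If you want to turn a throughput number from the table into running cost, a back-of-envelope sketch helps. The function below is hypothetical (not from the book): only the 106 tok/s figure comes from the table above; the GPU power draw, electricity price, and the idea of ignoring hardware amortisation are all assumptions you should replace with your own numbers.

```python
def cost_per_million_tokens(tok_per_s: float, watts: float, price_per_kwh: float) -> float:
    """Electricity-only cost to generate 1M tokens at a sustained rate.

    Hardware amortisation is deliberately excluded; every input except
    throughput is an assumption to be replaced with your own figures.
    """
    seconds = 1_000_000 / tok_per_s          # wall-clock time for 1M tokens
    kwh = watts * seconds / 3_600_000        # energy used in kWh
    return kwh * price_per_kwh

# Assumed figures: ~175 W draw for an RTX 2070, ~£0.25/kWh electricity,
# and the 106 tok/s measured above. Result is on the order of £0.11 per
# million tokens — electricity only.
print(round(cost_per_million_tokens(106, 175, 0.25), 3))
```

Swap in your own wattage and tariff; the point is that at these throughputs, electricity is a rounding error next to API pricing.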
The six-rule decision framework
From Chapter 1. Screenshot this and send it to your CFO.
- Under 1M tokens/day, no regulation: Stay on cloud APIs. Optimise billing first.
- 1–10M tokens/day, no regulation: Hybrid routing — 70% local, 30% frontier.
- Any volume + GDPR/HIPAA/FCA: Local is the defensible choice. The compliance premium makes it viable from day one.
- Above 10M tokens/day: Two consumer GPUs beat Claude on cost by month 3.
- Above 30M tokens/day: Local at every tier. Economics are overwhelming.
- GPU ownership: Rent until you use it 5+ hours/day. Below 3 hours, always rent.
What's in the book
- Ch 1 Economics — when local beats cloud and by how much
- Ch 2 Hardware — rent, buy, or use what you have
- Ch 3 Model selection — which model, which size, which format
- Ch 4 The craft — fine-tuning and evaluation as one discipline
- Ch 5 Delivery — documentation, deployment, and handoff
- Ch 6 The sale — pricing, positioning, and the three-way math
- App A The V2 overfit — a full post-mortem of our failed fine-tune
- App B Procurement FAQ — for compliance and legal teams
Download the book (free, no gate)
Direct PDF download. No email required.
Download PDF
Want updates when we publish new benchmarks or case studies?
No spam. One email per quarter with new benchmark data and what we learned. Unsubscribe anytime.
Who we are
We are Fathom. We are building a fine-tuning service for local LLMs in regulated industries. We wrote this book because we have to know the material cold anyway, and writing it publicly is a shorter path to competence than hoarding it.
Three benchmarks. One failed fine-tune. One book. Zero pretence.