Hardware & Architecture

What is Thermal Throttling?

Thermal throttling is when a CPU or GPU automatically reduces its clock speed to prevent overheating. In LLM inference, sustained throttling cuts tokens per second mid-generation, especially in small mini PC enclosures.

Full Explanation

Thermal throttling occurs when a processor's temperature exceeds a safe threshold, causing it to reduce its operating frequency to lower heat output. LLM inference is a sustained all-core workload, unlike gaming, where load varies, so mini PCs and laptops are more likely to throttle during inference than during typical use. For example, a mini PC that benchmarks at 12 t/s (tokens per second) in a 30-second test may sustain only 8 t/s, a drop of about a third, during a 10-minute document summarization task once its cooling is saturated.
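The gap between burst and sustained throughput can be estimated from throughput readings taken at regular intervals during a single long run. The sketch below is a minimal illustration, not a standard benchmark method: the function name `throttle_drop`, the window sizes, and the sample readings are all hypothetical, chosen to mirror the 12 t/s versus 8 t/s example.

```python
from statistics import mean

def throttle_drop(samples_tps, burst_window=3, sustained_window=3):
    """Return (burst_tps, sustained_tps, percent_drop) from tokens/sec samples.

    samples_tps: throughput readings taken at regular intervals during one run.
    Burst is the mean of the first readings; sustained is the mean of the last.
    """
    if len(samples_tps) < burst_window + sustained_window:
        raise ValueError("need at least as many samples as both windows combined")
    burst = mean(samples_tps[:burst_window])
    sustained = mean(samples_tps[-sustained_window:])
    drop_pct = (burst - sustained) / burst * 100
    return burst, sustained, drop_pct

# Hypothetical readings, one per minute, shaped like the 12 -> 8 t/s example.
readings = [12.0, 12.0, 12.0, 11.0, 10.0, 9.0, 8.5, 8.0, 8.0, 8.0]
burst, sustained, drop = throttle_drop(readings)
print(f"burst {burst:.1f} t/s, sustained {sustained:.1f} t/s, drop {drop:.0f}%")
# prints: burst 12.0 t/s, sustained 8.0 t/s, drop 33%
```

A drop that appears only after several minutes and then levels off is consistent with thermal throttling, although other causes (background tasks, power limits) can produce a similar curve.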

Why It Matters for Local AI

Always check sustained-load thermal reviews, not just burst benchmarks, when evaluating mini PCs for local AI. Mini PCs with larger cooling solutions (the Geekom AI A7 Max's dual-fan design, for example) sustain performance better than compact single-fan designs. In a desktop build, a high-end air cooler such as the Noctua NH-D15 can usually keep the CPU below its throttling threshold, provided the case has adequate airflow.
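When no sustained-load review is available, temperatures and clock speeds can be logged directly during a long generation. The sketch below assumes a Linux system that exposes the kernel's sysfs thermal zones and cpufreq files. Other operating systems need different tools, and some machines expose no zones at all. The function names are hypothetical.

```python
from pathlib import Path

def read_temps_c(root="/sys/class/thermal"):
    """Return {zone_name: degrees_C} for every readable thermal zone."""
    temps = {}
    for zone in sorted(Path(root).glob("thermal_zone*")):
        try:
            # sysfs reports temperature in millidegrees Celsius
            millideg = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # zone unreadable or not numeric; skip it
        temps[zone.name] = millideg / 1000.0
    return temps

def read_freqs_mhz(root="/sys/devices/system/cpu"):
    """Return current per-core clock speeds in MHz where cpufreq exposes them."""
    freqs = []
    for f in sorted(Path(root).glob("cpu[0-9]*/cpufreq/scaling_cur_freq")):
        try:
            freqs.append(int(f.read_text().strip()) / 1000.0)  # kHz -> MHz
        except (OSError, ValueError):
            continue
    return freqs

def snapshot():
    """Return (hottest_zone_C, mean_core_MHz); NaN where nothing is readable."""
    temps, freqs = read_temps_c(), read_freqs_mhz()
    hottest = max(temps.values(), default=float("nan"))
    mean_mhz = sum(freqs) / len(freqs) if freqs else float("nan")
    return hottest, mean_mhz

hottest, mean_mhz = snapshot()
print(f"hottest zone {hottest:.1f} C, mean core clock {mean_mhz:.0f} MHz")
```

Calling `snapshot()` every few seconds while a model generates shows whether clocks fall as temperature climbs. Falling clocks at a high, flat temperature suggest throttling.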

Hardware Relevant to Thermal Throttling

Noctua NH-D15 Premium CPU Cooler

accessory

GEEKOM AI A7 MAX Mini PC (Ryzen 9 7940HS, 16GB DDR5)

mini-pc · 16 GB Unified · 68 GB/s

KAMRUI Pinova P1 Mini PC (AMD Ryzen 3 4300U)

mini-pc · 16 GB Unified · 34 GB/s


Related Terms