What is eGPU?
External GPU — a discrete GPU connected via Thunderbolt to a laptop or mini PC. Enables GPU-accelerated LLM inference on machines without a built-in GPU slot.
Full Explanation
An eGPU enclosure houses a full-size PCIe GPU and connects to a host computer via Thunderbolt 3/4/5. The GPU handles inference while the host CPU runs the OS and applications. Thunderbolt 5 (80 Gbps symmetric, up to 120 Gbps one way with Bandwidth Boost) is the practical minimum for serious eGPU LLM use; older Thunderbolt 3/4 at 40 Gbps creates a bandwidth bottleneck that slows model loading and layer offloading and can roughly halve effective tokens per second compared to a native PCIe slot. Apple Silicon Macs do not support eGPUs at all (only Intel Macs did); eGPU is a Windows/Linux strategy.
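The link-bandwidth point above is easy to see with back-of-envelope arithmetic. A minimal sketch (the model size and the nominal link rates below are illustrative assumptions; real-world throughput is lower due to protocol overhead):

```python
# Rough estimate of how long it takes to move model weights into VRAM
# over different links. Rates are nominal; assume real throughput is lower.

def transfer_seconds(model_gb: float, link_gbps: float) -> float:
    """Seconds to move model_gb gigabytes over a link rated at link_gbps gigabits/s."""
    return model_gb * 8 / link_gbps

model_gb = 13.0  # e.g. a ~13 GB quantized model file (illustrative)
for name, gbps in [("Thunderbolt 3/4 (40 Gbps)", 40),
                   ("Thunderbolt 5  (80 Gbps)", 80),
                   ("PCIe 4.0 x16 (~256 Gbps)", 256)]:
    print(f"{name}: {transfer_seconds(model_gb, gbps):.1f} s")
```

The same gap applies whenever layers spill to system RAM mid-inference, which is why link speed shows up in tokens per second and not just in load times.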
Why It Matters for Local AI
eGPU is the path to GPU-accelerated LLM inference on small-form-factor Windows PCs or mini ITX builds that lack a PCIe x16 slot. With Thunderbolt 5, a mini PC like the Geekom A6 can drive an RTX 5070 externally with acceptable bandwidth overhead — expanding its AI capability significantly.
Related Terms
Thunderbolt 5
Intel's latest Thunderbolt standard — 80 Gbps symmetric bandwidth, up to 120 Gbps one way with Bandwidth Boost. Enables high-bandwidth eGPU enclosures, fast NVMe storage, and 8K display output from compact AI machines.
PCIe
Peripheral Component Interconnect Express — the bus that connects a discrete GPU to the motherboard. PCIe 4.0 or 5.0 is needed for fast model offloading when VRAM is exceeded.
VRAM
Video RAM — dedicated memory on a GPU. Determines the maximum model size you can run with full GPU acceleration. Once a model exceeds VRAM, it spills to system RAM over the slow PCIe bus.
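The "fits in VRAM" question can be estimated before buying anything. A hedged rule-of-thumb sketch (the 20% overhead allowance for KV cache and activations is an assumption, not a guarantee):

```python
# Back-of-envelope check: will a quantized model fit in a GPU's VRAM?
# Assumed rule of thumb: weight bytes ≈ params × bits_per_weight / 8,
# plus ~20% headroom for KV cache and activations.

def fits_in_vram(params_b: float, bits_per_weight: float, vram_gb: float,
                 overhead: float = 0.20) -> bool:
    weights_gb = params_b * bits_per_weight / 8  # billions of params -> GB
    return weights_gb * (1 + overhead) <= vram_gb

print(fits_in_vram(7, 4, 12))    # 7B model at 4-bit in a 12 GB card
print(fits_in_vram(70, 4, 12))   # 70B at 4-bit clearly spills to system RAM
```

Anything that fails this check will run, but with layers spilling over the PCIe (or Thunderbolt) bus, which is exactly where the eGPU bandwidth penalty bites hardest.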
Tensor Cores
Specialized hardware units on NVIDIA GPUs designed for matrix multiplication — the core math operation in neural networks. 5th-gen Tensor Cores (Blackwell) are significantly faster than 4th-gen (Ada Lovelace) for AI inference.