Hardware & Architecture

What is PCIe?

Peripheral Component Interconnect Express — the bus connecting a discrete GPU to the motherboard. PCIe 4.0 or 5.0 is needed for fast layer offloading when a model exceeds VRAM.

Full Explanation

PCIe (Peripheral Component Interconnect Express) is the high-speed serial interface connecting discrete GPUs to the CPU and system memory. PCIe 4.0 x16 provides ~32 GB/s of bandwidth in each direction; PCIe 5.0 doubles this to ~64 GB/s. For LLM inference that fits entirely in VRAM, PCIe speed affects only how quickly the model loads. It matters during generation only when a model overflows VRAM and must stream layers from system RAM — at that point PCIe bandwidth (32–64 GB/s) becomes the bottleneck rather than GPU memory bandwidth (e.g., 672 GB/s for the GDDR7 on an RTX 5070).
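To see why this gap matters, here is an illustrative back-of-the-envelope sketch (not a benchmark) of the upper bound on decode speed when every weight must be read once per generated token. The bandwidth figures are the approximate values quoted above; real-world throughput is lower due to compute, protocol overhead, and caching effects.

```python
# Rough upper bound on tokens/second if each generated token requires
# reading every model weight once from the given memory tier.
def tokens_per_second(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Bandwidth-bound ceiling on decode speed."""
    return bandwidth_bytes_per_s / model_bytes

GB = 1e9
model_size = 8 * GB  # illustrative: an ~8B-parameter model at 8-bit quantization

print(f"VRAM, 672 GB/s:     {tokens_per_second(model_size, 672 * GB):.0f} tok/s")  # 84 tok/s
print(f"PCIe 5.0, 64 GB/s:  {tokens_per_second(model_size, 64 * GB):.0f} tok/s")   # 8 tok/s
print(f"PCIe 4.0, 32 GB/s:  {tokens_per_second(model_size, 32 * GB):.0f} tok/s")   # 4 tok/s
```

The roughly 20x gap between VRAM and PCIe ceilings is why streaming weights over the bus dominates generation time the moment a model spills out of VRAM.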

Why It Matters for Local AI

If your model fits in VRAM, PCIe generation is irrelevant for inference speed. If you're intentionally offloading some layers to CPU RAM (e.g., running a 70B model on a 12 GB GPU), PCIe 4.0+ minimizes the penalty for those offloaded layers.
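The cost of partial offloading can be sketched with the same bandwidth-bound reasoning. This hypothetical model (all numbers illustrative, not measured) assumes per-token time is dominated by reading weights: resident layers from VRAM, offloaded layers streamed over PCIe.

```python
# Hypothetical estimate of per-token time with partial CPU-RAM offload.
# Assumes decode is purely memory-bandwidth-bound in each tier.
def token_time_s(model_gb: float, offload_frac: float,
                 vram_gbps: float = 672, pcie_gbps: float = 32) -> float:
    resident = model_gb * (1 - offload_frac) / vram_gbps   # layers held in VRAM
    offloaded = model_gb * offload_frac / pcie_gbps        # layers streamed over PCIe
    return resident + offloaded

# Illustrative: ~40 GB of weights (a 70B model at ~4-bit quantization),
# with 70% of layers offloaded because only 12 GB of VRAM is available.
t = token_time_s(40, 0.7)
print(f"{1 / t:.1f} tok/s")  # ~1.1 tok/s: the PCIe-streamed layers dominate
```

Even though 30% of the layers run from fast VRAM, the streamed 70% contributes nearly all of the per-token time, which is why doubling PCIe bandwidth (4.0 to 5.0) nearly doubles throughput in heavily offloaded setups.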

Hardware Relevant to PCIe

GIGABYTE GeForce RTX 5070 WINDFORCE OC 12G

GPU · 12 GB VRAM · 672 GB/s memory bandwidth

GIGABYTE Radeon RX 9060 XT GAMING OC 16G

GPU · 16 GB VRAM · 288 GB/s memory bandwidth

Related Terms