What is GDDR6?
Previous-generation GPU memory. Lower bandwidth than GDDR7, but paired with larger capacities (e.g., 16GB RX 9060 XT) can offer better model headroom despite lower token speed.
Full Explanation
GDDR6 is the memory standard used in AMD's RDNA 4 generation (RX 9060 XT) and previous NVIDIA cards. At ~512 GB/s on a 128-bit bus, it delivers roughly 24% less bandwidth than GDDR7 but supports larger VRAM capacities at a given price point — the RX 9060 XT ships with 16 GB compared to the RTX 5070's 12 GB. This capacity advantage is meaningful for users who prioritize model size over raw speed.
Why It Matters for Local AI
If you want to run 13B models comfortably without quantization compromises, the RX 9060 XT's 16 GB GDDR6 gives more headroom than the RTX 5070's 12 GB GDDR7 — at the cost of roughly 30 fewer tokens per second.
Hardware Relevant to GDDR6
gpu · Check Price on Amazon · 16 GB VRAM · 288 GB/s
Related Terms
VRAM→
Video RAM — dedicated memory on a GPU. Determines the maximum model size you can run with full GPU acceleration. Once a model exceeds VRAM, it spills to system RAM over the slow PCIe bus.
Memory Bandwidth→
How fast data moves between memory and the processor, measured in GB/s. Tokens per second scales nearly linearly with bandwidth — this is the single most important GPU spec for LLM speed.
GDDR7→
The latest generation of GPU memory (2024+). Significantly higher bandwidth than GDDR6X at the same capacity tier. Used in NVIDIA Blackwell cards (RTX 5070 series).
RDNA 4→
AMD's 2024 GPU architecture. Notable IPC improvement over RDNA 3, improved AI inference throughput, paired with GDDR6 in the RX 9060 XT series.