What is an NVMe SSD?
High-speed solid-state storage that connects over the PCIe bus. It determines how quickly models load into memory at startup: a PCIe 4.0 NVMe drive loads a 7B model in ~2 seconds, versus ~15 seconds on a SATA SSD.
Full Explanation
NVMe (Non-Volatile Memory Express) SSDs connect directly to the PCIe bus, delivering sequential read speeds of 5,000–14,000 MB/s versus 500–600 MB/s for SATA SSDs. For local AI, NVMe speed affects model load time — the interval between starting Ollama with a new model and generating the first token. A 4 GB Q4 7B model loads in roughly 1–2 seconds on a PCIe 4.0 NVMe, 3–5 seconds on PCIe 3.0, and 12–18 seconds on a SATA SSD. Once the model is loaded, inference speed is determined by VRAM/RAM bandwidth, not by storage.
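The load-time estimates above follow from simple size-over-throughput arithmetic. A minimal sketch in Python (the drive speeds below are typical mid-range figures, not benchmarks of any specific drive, and real loads add filesystem and allocation overhead, so treat the results as optimistic lower bounds):

```python
# Back-of-the-envelope model load time: file size divided by the
# drive's sequential read throughput. Speeds are typical figures
# for each interface, not measured values.
TYPICAL_SEQ_READ_MBPS = {
    "PCIe 4.0 NVMe": 7000,
    "PCIe 3.0 NVMe": 3500,
    "SATA SSD": 550,
}

def load_seconds(model_size_gb: float, seq_read_mbps: float) -> float:
    """Seconds to stream a model file at a given sequential read speed."""
    return model_size_gb * 1024 / seq_read_mbps

for drive, mbps in TYPICAL_SEQ_READ_MBPS.items():
    # 4 GB is the size of the Q4 7B model used in the text above.
    print(f"{drive}: ~{load_seconds(4.0, mbps):.1f} s")
```

The gap between the estimate (~7.4 s for SATA) and the observed 12–18 s is the per-load overhead that raw throughput arithmetic ignores.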
Why It Matters for Local AI
NVMe matters most if you frequently switch between models — a common pattern when running several specialized models for different tasks. If you load one model and keep it running, a SATA SSD is adequate. Modern mini PCs and Macs ship with NVMe as standard; verify the PCIe generation (4.0 preferred) when comparing storage specs.
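On Linux you can check which PCIe generation a drive actually negotiated (as opposed to what the spec sheet claims) via the kernel's sysfs attributes. A small sketch, assuming the drive is `nvme0` and noting that the exact speed-string format varies by kernel version, so the parser only reads the leading GT/s number:

```python
from pathlib import Path

# Per-lane transfer rate (GT/s) -> PCIe generation.
GEN_BY_GTS = {2.5: "1.0", 5.0: "2.0", 8.0: "3.0", 16.0: "4.0", 32.0: "5.0"}

def pcie_gen(link_speed: str) -> str:
    """Map a sysfs speed string like '16.0 GT/s PCIe' to a generation."""
    gts = float(link_speed.split()[0])
    gen = GEN_BY_GTS.get(gts)
    return f"PCIe {gen}" if gen else f"unrecognized link speed: {link_speed}"

def nvme_pcie_gen(dev: str = "nvme0") -> str:
    """Read the negotiated link speed of an NVMe controller (Linux only)."""
    path = Path(f"/sys/class/nvme/{dev}/device/current_link_speed")
    return pcie_gen(path.read_text().strip())

# Example: pcie_gen("16.0 GT/s PCIe") -> 'PCIe 4.0'
```

Running `nvme_pcie_gen()` on the target machine also catches drives installed in a slot that only wires up an older generation, where the negotiated speed is lower than the drive's rating.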
Related Terms
PCIe→
Peripheral Component Interconnect Express — the high-speed bus that connects GPUs, NVMe drives, and other expansion devices to the CPU. PCIe 4.0 or 5.0 is needed for fast model offloading when VRAM is exceeded.
VRAM→
Video RAM — dedicated memory on a GPU. Determines the maximum model size you can run with full GPU acceleration. Once a model exceeds VRAM, it spills to system RAM over the slow PCIe bus.
Unified Memory→
A single pool of fast RAM on Apple Silicon, shared between the CPU and GPU. Larger unified memory = larger models run entirely at full bandwidth — no PCIe bottleneck.
CPU Inference→
Running LLMs on the CPU rather than a GPU. Works on any hardware, no special drivers needed. Limited to ~8–12 t/s on 7B models — fine for background tasks, slow for interactive use.