What is LoRA?
Low-Rank Adaptation — a fine-tuning technique that trains a small set of adapter weights instead of updating the full model. Runs on consumer GPUs; with quantized (QLoRA-style) training, as little as 8 GB of VRAM can suffice.
Full Explanation
LoRA (Low-Rank Adaptation) inserts small trainable matrices into a frozen base model's attention layers, updating only 0.1–1% of total parameters during training. This makes fine-tuning accessible on consumer hardware: a 7B model fine-tune with LoRA requires roughly 10–14 GB VRAM versus 80+ GB for full fine-tuning. The resulting adapter file is typically 50–500 MB and is applied on top of the base model at inference time using tools like Ollama, llama.cpp, or LM Studio.
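The core idea can be sketched in a few lines. This is an illustrative toy in pure Python with made-up dimensions and a hypothetical `lora_forward` helper, not any library's implementation: the base weight W stays frozen, and only the two small factors A and B would be trained.

```python
# Minimal sketch of a LoRA-augmented linear layer (illustrative only).
# W is the frozen base weight; A and B are the small trainable factors.

def matvec(M, x):
    return [sum(m * v for m, v in zip(row, x)) for row in M]

d, r, alpha = 4, 2, 4            # d: layer width, r: LoRA rank (r << d)
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen
A = [[0.1] * d for _ in range(r)]     # trainable down-projection (r x d)
B = [[0.0] * r for _ in range(d)]     # trainable up-projection, zero-init (d x r)

def lora_forward(x):
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))  # low-rank path: B @ (A @ x)
    scale = alpha / r                 # standard LoRA scaling
    return [b + scale * u for b, u in zip(base, update)]

x = [1.0, 2.0, 3.0, 4.0]
# Because B is zero-initialized, training starts from the unmodified base model:
print(lora_forward(x))  # equals W @ x
```

Zero-initializing B is the detail that makes the adapter a no-op at step zero, so fine-tuning perturbs the base model gradually rather than starting from random noise.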
Why It Matters for Local AI
LoRA is how most hobbyists customize models — teaching a 7B base model a specific writing style, domain vocabulary, or task behavior. An RTX 5070 with 12 GB VRAM is the minimum comfortable GPU for LoRA training on 7B models. The RTX 5080 with 16 GB handles 13B comfortably.
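A back-of-envelope estimate shows why adapters stay so small relative to the base model. The architecture numbers below (32 layers, 4096 hidden size, rank 16, two adapted matrices per layer) are typical assumptions for a Llama-style 7B model, not exact figures for any specific checkpoint:

```python
# Rough trainable-parameter estimate for a LoRA adapter on a 7B-class model.
# Assumed architecture numbers (typical Llama-style 7B, not exact):
layers = 32
hidden = 4096
rank = 16                 # a common LoRA rank
targets_per_layer = 2     # e.g. the query and value projections

# Each adapted (hidden x hidden) matrix adds two low-rank factors:
# an (rank x hidden) down-projection and a (hidden x rank) up-projection.
params_per_matrix = rank * (hidden + hidden)
trainable = layers * targets_per_layer * params_per_matrix

total = 7_000_000_000
print(f"trainable params: {trainable:,}")                 # ~8.4M
print(f"fraction of base: {trainable / total:.2%}")       # ~0.12%
print(f"fp16 adapter size: {trainable * 2 / 1e6:.0f} MB") # ~17 MB
```

With these assumptions the adapter lands at the low end of the 50–500 MB range quoted above; higher ranks or adapting more matrices per layer pushes it toward the high end.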
Hardware Relevant to LoRA
RTX 5070 · 12 GB VRAM · 672 GB/s
RTX 5080 · 16 GB VRAM · 960 GB/s
Related Terms
Quantization
Compressing a model by reducing numeric precision. Q4 = 4-bit (smallest, fastest), Q8 = 8-bit (balanced), FP16 = full precision. Fewer bits mean less VRAM required, with a slight quality reduction.
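A rough weights-only VRAM estimate for a 7B model at each precision makes the trade-off concrete (this ignores KV cache and runtime overhead, which add a few more GB, so treat it as a lower bound):

```python
# Weights-only memory footprint of a 7B-parameter model at each precision.
params = 7_000_000_000
vram_gb = {name: params * bits / 8 / 1e9       # bits -> bytes -> GB
           for name, bits in [("FP16", 16), ("Q8", 8), ("Q4", 4)]}

for name, gb in vram_gb.items():
    print(f"{name}: {gb:.1f} GB")
```

This is why a Q4 7B model fits comfortably on a 12 GB card while the same model at FP16 does not.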
GGUF
The standard file format for quantized LLMs used by llama.cpp and Ollama. Replaces the older GGML format. Stores model weights and metadata in a single portable file.
VRAM
Video RAM — dedicated memory on a GPU. Determines the maximum model size you can run with full GPU acceleration. Once a model exceeds VRAM, it spills to system RAM over the slow PCIe bus.
llama.cpp
The foundational C++ inference engine for running quantized LLMs locally. Powers Ollama, LM Studio, and most local AI tools under the hood. Supports CPU, CUDA, ROCm, and Metal.
Ollama
Free open-source tool for running LLMs locally on macOS, Linux, and Windows. Download a model with a single command. No cloud account required. Supports Llama, Mistral, Qwen, Phi, and more.