Run SDXL and FLUX on RTX 5070
How to run SDXL and FLUX.1 on the NVIDIA RTX 5070 with 12 GB GDDR7 — setup, benchmarks, and VRAM optimization tips.
Generation Time
3–5s per 1024×1024 (SDXL)
Min Memory
8 GB
Software
Python 3.11, CUDA 12.8, ComfyUI
Step-by-Step Setup
- 01
Install CUDA 12.8 and drivers
The RTX 5070 (Blackwell) requires driver ≥ 570 and CUDA 12.8. Use the official NVIDIA installer on Windows; on Linux, install from NVIDIA's CUDA repository rather than your distro's default apt packages.
# Verify on Linux
nvidia-smi
nvcc --version
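If you prefer scripting the check, the same verification can be done from Python by parsing nvidia-smi's CSV query output. The query flags below are standard nvidia-smi options; the sample string in the final comment is illustrative, not a measured value.

```python
import subprocess

# Parse `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader`
# output into (name, total MiB) pairs: a quick driver sanity check.
def parse_gpu_csv(text: str):
    gpus = []
    for line in text.strip().splitlines():
        name, mem = [field.strip() for field in line.split(",")]
        gpus.append((name, int(mem.split()[0])))
    return gpus

def query_gpus():
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_csv(out)

# e.g. parse_gpu_csv("NVIDIA GeForce RTX 5070, 12282 MiB")
#      returns [("NVIDIA GeForce RTX 5070", 12282)]
```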
- 02
Install ComfyUI
ComfyUI is the recommended front-end for SDXL and FLUX — better memory management than A1111 for high-res generation.
git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI
pip install -r requirements.txt
- 03
Download SDXL checkpoint
The fp16 base model is ~6.5 GB — well within the RTX 5070's 12 GB VRAM.
cd models/checkpoints
wget https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0_0.9vae.safetensors
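The ~6.5 GB figure is easy to sanity-check with back-of-envelope arithmetic: parameter count × bytes per element. The 3.5B total used below is an approximate public figure for SDXL base (UNet plus text encoders plus VAE), not something measured in this guide.

```python
# Rough checkpoint-size estimate: parameter count × bytes per element.
BYTES_PER_ELEMENT = {"fp32": 4, "fp16": 2, "bf16": 2, "fp8": 1}

def weight_gb(params_billions: float, dtype: str) -> float:
    """GiB needed for the weights alone (no activations or overhead)."""
    return params_billions * 1e9 * BYTES_PER_ELEMENT[dtype] / 1024**3

# SDXL base is roughly 3.5B parameters in total (approximate figure)
print(f"SDXL fp16 ≈ {weight_gb(3.5, 'fp16'):.1f} GB")  # ≈ 6.5 GB
```

The same arithmetic shows why fp8 halves a checkpoint's footprint relative to fp16.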
- 04
Launch ComfyUI
Start the server and open the browser UI. The RTX 5070 will appear in GPU selection automatically.
python main.py --gpu-only
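Besides the browser UI, the running server also exposes an HTTP API: workflows can be queued by POSTing JSON to the /prompt endpoint on ComfyUI's default port 8188. The one-node workflow dict below is a placeholder; export a real one from the UI with "Save (API Format)".

```python
import json
import urllib.request

# Minimal sketch of queueing a workflow via ComfyUI's HTTP API.
# The server started by `python main.py` listens on 127.0.0.1:8188
# by default.
def build_prompt_request(workflow: dict, host: str = "127.0.0.1:8188"):
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"http://{host}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

# Placeholder workflow; use one exported from the UI in practice.
req = build_prompt_request({"1": {"class_type": "KSampler", "inputs": {}}})
# submit with: urllib.request.urlopen(req)
```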
- 05
Run FLUX.1 Dev for maximum quality
The RTX 5070's 12 GB VRAM fits FLUX.1 Dev in fp8 quantization. Expect ~6–8s for 1024×1024.
# ~8 GB fp8 checkpoint
huggingface-cli download black-forest-labs/FLUX.1-dev \
  --local-dir models/checkpoints/flux-dev
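To avoid pulling every file in the repository, huggingface-cli's --include flag filters the download by pattern. The helper below only assembles the command as an argv list; the *.safetensors pattern is an assumption about which files you actually need.

```python
# Build the download command programmatically so the --include pattern
# (an assumption here) is easy to tweak.
def hf_download_cmd(repo: str, local_dir: str,
                    include: str = "*.safetensors") -> list:
    return [
        "huggingface-cli", "download", repo,
        "--local-dir", local_dir,
        "--include", include,
    ]

cmd = hf_download_cmd("black-forest-labs/FLUX.1-dev",
                      "models/checkpoints/flux-dev")
# run with: subprocess.run(cmd, check=True)
```

Note that FLUX.1 Dev is a gated repository, so authenticate first with `huggingface-cli login`.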
Optimization Tips
- Enable --bf16-unet in ComfyUI if you see NaN artifacts; the RTX 5000 series prefers bf16 over fp16 for FLUX.
- 12 GB of VRAM fits SDXL base + refiner in a single pass with ComfyUI's default memory management; add --highvram for batched generation.
- FLUX.1 Dev renders text noticeably better than SDXL, which is usually worth the roughly 2× longer generation time.
- PyTorch 2.7+ ships Blackwell-optimized (sm_120) CUDA kernels; use a cu128 build for the RTX 50-series.
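One reason 12 GB stretches as far as it does: diffusion runs in latent space. SDXL's VAE downsamples by 8× into 4 latent channels, so the per-image latent is tiny and the model weights dominate VRAM. A quick sketch (denoising activations are far larger, so treat this as a lower bound):

```python
# SDXL latent for a W×H image is 4 × (W/8) × (H/8). This estimates the
# latent tensor size per batch; weights, not latents, dominate VRAM.
def latent_mb(width: int, height: int, batch: int = 1,
              channels: int = 4, bytes_per: int = 2) -> float:
    return batch * channels * (width // 8) * (height // 8) * bytes_per / 2**20

print(f"{latent_mb(1024, 1024):.3f} MB")  # prints 0.125 MB (fp16)
```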
Related Guides
Run Llama 3.1 70B on RTX 5070→
How to run Llama 3.1 70B (Q4) on an RTX 5070 12 GB using Ollama — includes VRAM limits, layer offload settings, and expected speed.
Run Stable Diffusion on Mac Mini M4→
How to run SDXL and FLUX on the Mac Mini M4 using Diffusers or ComfyUI — with expected generation times and optimization tips.