VRAM Calculator For LLM

LLM VRAM Usage Estimator

Calculation Method & Disclaimer:

  • This calculator provides a rough estimate. Actual VRAM usage can vary.
  • Calculation: VRAM ≈ Model Weights + KV Cache + Fixed Overhead.
  • Model Weights: `Model Size (billions of parameters) * Bits Per Weight / 8` GB.
  • KV Cache (Approx.): Uses a simplified formula based on model size, context length, and quantization level. This is the least precise part of the estimate.
  • Fixed Overhead (Approx.): Assumed ~1.0-1.5 GB for software, CUDA, etc.
  • The bits-per-weight figures shown next to quantization formats are approximate averages.
  • Real-world usage depends heavily on the loader software (llama.cpp, ExLlamaV2, vLLM, etc.), batch size, drivers, and the specific model implementation.
  • Treat this as a guideline, not an exact figure.
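
The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code. The weights formula and the 1.0-1.5 GB overhead range come from the list above. The KV cache helper is an assumption: the calculator's simplified formula is not given here, so the sketch substitutes the common per-layer accounting (keys plus values for every layer, KV head, and token). All function names, parameter names, and the example model shape are hypothetical.

```python
def model_weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Weights in GB: billions of parameters * bits per weight / 8."""
    return params_billions * bits_per_weight / 8


def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                context_len: int, bytes_per_elem: float = 2.0,
                batch_size: int = 1) -> float:
    """KV cache in GB (1 GB = 1e9 bytes).

    ASSUMPTION: this is the usual transformer accounting, not the
    calculator's own simplified formula. The leading 2 covers keys and
    values; bytes_per_elem=2.0 corresponds to an fp16 cache.
    """
    total_bytes = (2 * n_layers * n_kv_heads * head_dim
                   * context_len * batch_size * bytes_per_elem)
    return total_bytes / 1e9


def estimate_vram_gb(params_billions: float, bits_per_weight: float,
                     kv_gb: float, overhead_gb: float = 1.25) -> float:
    """VRAM ~= model weights + KV cache + fixed overhead.

    overhead_gb defaults to the midpoint of the stated 1.0-1.5 GB range.
    """
    return model_weights_gb(params_billions, bits_per_weight) + kv_gb + overhead_gb


if __name__ == "__main__":
    # Hypothetical 7B model at ~4.5 bits per weight with an 8192-token context.
    # The layer/head shape below is an illustrative assumption.
    kv = kv_cache_gb(n_layers=32, n_kv_heads=8, head_dim=128, context_len=8192)
    total = estimate_vram_gb(7, 4.5, kv)
    print(f"weights:  {model_weights_gb(7, 4.5):.2f} GB")
    print(f"KV cache: {kv:.2f} GB")
    print(f"total:    {total:.2f} GB")
```

Keeping the KV cache as a separate input makes it easy to swap in whatever approximation a given loader actually follows, since that term varies the most between implementations.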
