Model Memory Estimator
Estimate inference memory requirements for Hugging Face models. Enter a model ID to see total parameters and estimated memory (GB/MB) without downloading the full model.
Suggested models from Hugging Face, or enter any model ID
Default: main
Required for gated or private models
Required — see whether the model fits and how many GPUs are needed
What is this?
The Model Memory Estimator helps you understand how much memory (RAM or VRAM) a Hugging Face model will need for inference. It works with Transformers, Diffusers, Sentence Transformers, and any model that uses Safetensors weights.
How does it work?
This tool uses the hf-mem package to fetch only the Safetensors metadata from Hugging Face via HTTP Range requests—typically the first ~100KB of each weight file. From that metadata (tensor shapes and data types), it computes total parameter count and estimated memory. No full model download is required.
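The estimate works because a Safetensors file starts with an 8-byte little-endian header length followed by a JSON table mapping each tensor name to its dtype and shape, so fetching only that prefix is enough to size the whole model. The sketch below (an illustration of the mechanism, not the hf-mem package's actual API) parses a synthetic prefix standing in for the bytes a Range request would return, and sums parameters and bytes per dtype:

```python
import json
import struct

# Bytes per element for common Safetensors dtypes (assumption: subset shown).
DTYPE_BYTES = {"F32": 4, "F16": 2, "BF16": 2, "I8": 1, "U8": 1, "I64": 8}

def estimate_from_header(prefix: bytes):
    """Parse a Safetensors file prefix: an 8-byte little-endian header
    length, then a JSON table of {name: {dtype, shape, data_offsets}}.
    Returns (total_parameters, total_bytes)."""
    (header_len,) = struct.unpack("<Q", prefix[:8])
    header = json.loads(prefix[8:8 + header_len])
    params = 0
    total_bytes = 0
    for name, info in header.items():
        if name == "__metadata__":  # optional metadata entry, no tensor data
            continue
        n = 1
        for dim in info["shape"]:
            n *= dim
        params += n
        total_bytes += n * DTYPE_BYTES[info["dtype"]]
    return params, total_bytes

# Synthetic header standing in for the prefix fetched via an HTTP Range request
# (hypothetical tensor names, for illustration only).
table = {
    "model.embed": {"dtype": "F16", "shape": [1000, 64], "data_offsets": [0, 128000]},
    "model.proj":  {"dtype": "F16", "shape": [64, 64],   "data_offsets": [128000, 136192]},
}
payload = json.dumps(table).encode()
prefix = struct.pack("<Q", len(payload)) + payload

params, nbytes = estimate_from_header(prefix)
print(params, nbytes)  # 68096 parameters, 136192 bytes (2 bytes/param at F16)
```

For a sharded model, the same computation runs over the header of each shard and the results are summed; converting bytes to GB/MB is then a division, with no weight data ever downloaded.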
Browser-only & Privacy
This tool runs entirely in your browser. Nothing is sent to our servers. Your Hugging Face token (if you enter one) is used only between your browser and Hugging Face—it never goes to any other server and is not stored. Only metadata is fetched from Hugging Face; model weights are not downloaded. The model ID you enter is sent to Hugging Face to list and read file metadata.