Local LLM Hub

Run LLMs on your own hardware. Find the right launcher, engine, and configuration for your setup.

NVIDIA-first. Mac-strong. Pick your GPU, get your stack.

Quick Start: NVIDIA, 8GB VRAM (RTX 3060/3070, RTX 4060)

Beginner (gguf)
ollama + llama.cpp
Quant: Q4_K_M
7B models (Q4) run smoothly. 13B is challenging, because at Q4 its weights alone take up most of the 8GB.

GUI (gguf)
lm-studio + llama.cpp
Quant: Q4_K_M
Easy GUI with simple model management.

Power (gguf, gptq)
text-generation-webui + llama.cpp
Quant: Q4_K_M or GPTQ-4bit
For users who need fine-grained control.
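
The 7B versus 13B guidance above comes down to simple arithmetic on the size of the weights. The sketch below is a rough estimate, not a measurement: the average of about 4.85 bits per weight for Q4_K_M and the 1.5 GiB allowance for KV cache and runtime overhead are both assumptions, and the function names are hypothetical.

```python
def weight_memory_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits_in_vram(params_billion: float, bits_per_weight: float,
                 vram_gib: float, overhead_gib: float = 1.5) -> bool:
    """Rough check: weights plus an assumed overhead must fit in VRAM.

    overhead_gib is an assumed allowance for KV cache, activations and
    the runtime itself; real usage depends on context length and backend.
    """
    needed = weight_memory_gib(params_billion, bits_per_weight) + overhead_gib
    return needed <= vram_gib


# Assumed average bits per weight for Q4_K_M (mixed-precision quant).
BITS_Q4_K_M = 4.85

if __name__ == "__main__":
    for size in (7, 13):
        weights = weight_memory_gib(size, BITS_Q4_K_M)
        ok = fits_in_vram(size, BITS_Q4_K_M, vram_gib=8)
        print(f"{size}B at Q4_K_M: ~{weights:.1f} GiB weights, fits in 8 GiB: {ok}")
```

Under these assumptions a 7B model leaves several GiB of headroom on an 8GB card, while a 13B model only runs with part of the model offloaded to system RAM.
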
2 local LLM tools

Columns: Name, Role, Backends, Formats, Score, Install

GGUF
GPT-Generated Unified Format for efficient LLM storage
Role: Format; Backends: cuda, metal, rocm...; other columns: -

safetensors
Safe and fast tensor serialization format by Hugging Face
Role: Format; Backends: cuda, metal, rocm...; other columns: -
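
To tell the two formats apart on disk, it is usually enough to look at the first few bytes of the file. A GGUF file starts with the four ASCII bytes `GGUF`, and a safetensors file starts with an 8-byte little-endian length followed by a JSON header of that length. The following is a minimal sketch under those assumptions; `detect_model_format` is a hypothetical helper, and the 100 MB cap on the header size is an arbitrary sanity bound chosen for this example.

```python
import json
import struct


def detect_model_format(path: str) -> str:
    """Return "gguf", "safetensors" or "unknown" based on the file header."""
    with open(path, "rb") as f:
        head = f.read(8)

        # GGUF: file begins with the magic bytes b"GGUF".
        if head[:4] == b"GGUF":
            return "gguf"

        # safetensors: u64 little-endian header length, then a JSON object.
        if len(head) == 8:
            (header_len,) = struct.unpack("<Q", head)
            # Arbitrary sanity bound so random files are not read wholesale.
            if 0 < header_len <= 100_000_000:
                raw = f.read(header_len)
                if len(raw) == header_len:
                    try:
                        if isinstance(json.loads(raw), dict):
                            return "safetensors"
                    except ValueError:
                        # Not valid UTF-8 or not valid JSON.
                        pass
    return "unknown"
```

Calling `detect_model_format("model.gguf")` on a downloaded file (the filename here is only an illustration) is a quick way to confirm which loader a launcher will need before starting it.
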