
Local LLM Hub

Run LLMs on your own hardware. Find the right launcher, engine, and configuration for your setup.

NVIDIA-first. Mac-strong. Pick your GPU, get your stack.

Quick Start: Mac with 16 GB RAM (base M1/M2/M3)

Beginner (GGUF)
ollama + llama.cpp
Quant: Q4_K_M
7B models (Q4) run smoothly; Metal supported.
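For scale: Q4_K_M stores roughly 4.5 to 5 bits per weight, so a 7B model needs about 4 to 5 GB for weights, well within 16 GB of unified memory. Below is a minimal sketch of chatting with such a model through the official ollama Python client; it assumes the ollama daemon is running and that a model has already been pulled (the model name is just an example).

```python
# Minimal chat via the ollama Python client (pip install ollama).
# Assumes `ollama serve` is running and the model was pulled first,
# e.g. `ollama pull llama3.1` (the model name here is an example).
import ollama

response = ollama.chat(
    model="llama3.1",
    messages=[{"role": "user", "content": "Summarize what Q4_K_M quantization is."}],
)
print(response["message"]["content"])
```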
GUI (GGUF)
lm-studio + llama.cpp
Quant: Q4_K_M
Mac-native and easy to use.
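LM Studio can also expose an OpenAI-compatible local server (by default at http://localhost:1234/v1 once enabled in the app). A sketch using the openai Python package, assuming that default base URL; the model identifier is a placeholder and the API key can be any non-empty string:

```python
# Query LM Studio's OpenAI-compatible local server (pip install openai).
# Assumes the server is enabled in LM Studio; the default base URL is shown.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
resp = client.chat.completions.create(
    model="local-model",  # placeholder; LM Studio serves whichever model is loaded
    messages=[{"role": "user", "content": "Hello from LM Studio!"}],
)
print(resp.choices[0].message.content)
```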
Apple Native (MLX)
mlx-community + mlx
Quant: 4-bit
Apple-optimized, but limited model support.
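The mlx-lm package provides a small Python API over MLX models. A sketch assuming mlx-lm is installed on Apple Silicon; the repo name is one example 4-bit model from the mlx-community Hugging Face organization, so substitute any model it hosts:

```python
# Generate text with mlx-lm (pip install mlx-lm); Apple Silicon only.
# The model repo below is an assumed example from mlx-community.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")
text = generate(
    model,
    tokenizer,
    prompt="Explain unified memory in one sentence.",
    max_tokens=100,
)
print(text)
```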
4 entries: GPU compute backends

CUDA Runtime (cuda)
NVIDIA's parallel computing platform for GPU acceleration.
Role: Backend · Platforms: 🐧 🪟

ROCm (rocm)
AMD's open-source GPU computing platform.
Role: Backend · Platforms: 🐧

Metal (metal)
Apple's GPU framework for Apple Silicon acceleration.
Role: Backend

Vulkan (vulkan)
Cross-platform GPU API for compute and graphics.
Role: Backend
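Which backend applies is mostly a function of the machine you are on. A rough heuristic sketch, not part of this site's tooling: treat Apple Silicon as Metal, the presence of nvidia-smi as CUDA, rocminfo as ROCm, and fall back to Vulkan otherwise.

```python
# Heuristic GPU-backend detection; every check here is an assumption
# for illustration, not an official API: nvidia-smi implies CUDA,
# rocminfo implies ROCm, Darwin/arm64 implies Metal, else Vulkan.
import platform
import shutil

def detect_backend() -> str:
    if platform.system() == "Darwin" and platform.machine() == "arm64":
        return "metal"   # Apple Silicon
    if shutil.which("nvidia-smi"):
        return "cuda"    # NVIDIA driver tools on PATH
    if shutil.which("rocminfo"):
        return "rocm"    # ROCm stack on PATH
    return "vulkan"      # cross-platform fallback

if __name__ == "__main__":
    print(detect_backend())
```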