Recommended Stacks

Pre-configured combinations of launcher + engine + format + quantization for your hardware.

💻 CPU-Only Stacks

No GPU required; CPU inference only.

16GB RAM

Beginner
  Stack: ollama + llama.cpp
  Format: gguf
  Quantization: Q4_K_M
  💡 3B-7B models. Slow, but it works.

GUI
  Stack: lm-studio + llama.cpp
  Format: gguf
  Quantization: Q4_K_M
  💡 Runs in CPU inference mode.

32GB+ RAM

Beginner
  Stack: ollama + llama.cpp
  Format: gguf
  Quantization: Q4_K_M
  💡 7B-13B models. Generation takes time.

GUI
  Stack: lm-studio + llama.cpp
  Format: gguf
  Quantization: Q4_K_M to Q5_K_M
  💡 13B works, but slowly.
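As a rough cross-check on these RAM tiers, a quantized model's memory footprint can be estimated from its parameter count and the bits stored per weight. A minimal sketch, where the ~4.5 bits/weight figure for Q4_K_M and the 20% runtime overhead (KV cache, buffers) are assumptions, not measurements from this guide:

```python
def est_ram_gb(params_b: float, bits_per_weight: float = 4.5,
               overhead: float = 1.2) -> float:
    """Rough RAM estimate for a quantized GGUF model.

    params_b: parameter count in billions.
    bits_per_weight: ~4.5 for Q4_K_M, ~5.5 for Q5_K_M (assumed values).
    overhead: multiplier for KV cache and runtime buffers (assumed 20%).
    """
    return params_b * bits_per_weight / 8 * overhead

# A 7B model at Q4_K_M fits easily in 16GB; 13B still leaves headroom in 32GB.
print(round(est_ram_gb(7), 1))   # ≈ 4.7 GB
print(round(est_ram_gb(13), 1))  # ≈ 8.8 GB
```

This is why the 16GB tier tops out around 7B: the model weights alone are manageable, but the OS, the launcher, and longer contexts consume the rest.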