📚 Recommended Stacks

Pre-configured combinations of launcher + engine + format + quantization, matched to your hardware.

🍎 Mac (Apple Silicon) Stacks

For Apple Silicon users: unified memory lets the GPU address most of system RAM, and the MLX ecosystem adds Apple-native tooling.

16GB RAM (M1/M2/M3 base)

Beginner
Stack: ollama + llama.cpp
Format: GGUF
Quantization: Q4_K_M
Install:
  brew install ollama
  ollama pull llama3.2:3b
💡 7B models at Q4 run smoothly; Metal acceleration is supported. A usage sketch follows below.
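Once the model is pulled, any local program can talk to the Ollama server over HTTP. A minimal sketch, assuming Ollama is running on its default port 11434 and llama3.2:3b has been pulled; the prompt text is just an example:

```python
# Query the local Ollama server via its REST API (non-streaming).
import json
import urllib.request

payload = {
    "model": "llama3.2:3b",  # model pulled with `ollama pull llama3.2:3b`
    "prompt": "Explain unified memory in one sentence.",
    "stream": False,         # return one JSON object instead of a token stream
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```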
GUI
Stack: lm-studio + llama.cpp
Format: GGUF
Quantization: Q4_K_M
💡 Native Mac app; the easiest way to get started. A scripting sketch follows below.
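LM Studio can also expose the loaded model through its built-in local server, which speaks an OpenAI-compatible API (default address http://localhost:1234/v1). A minimal sketch, assuming the local server is enabled and a GGUF model is loaded in the GUI; the model name below is a placeholder:

```python
# Call LM Studio's local OpenAI-compatible server from Python.
from openai import OpenAI

# The API key is not checked by LM Studio; any string works.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

reply = client.chat.completions.create(
    model="local-model",  # placeholder; LM Studio serves whichever model is loaded
    messages=[{"role": "user", "content": "Say hello in five words."}],
)
print(reply.choices[0].message.content)
```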
Apple Native
Stack: mlx-community + mlx
Format: MLX
Quantization: 4-bit
Install:
  pip install mlx mlx-lm
  mlx_lm.generate --model mlx-community/Llama-3.2-3B-Instruct-4bit --prompt "Hello"
💡 Optimized for Apple Silicon; model selection is more limited than GGUF. A Python sketch follows below.
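The same model can be driven from Python instead of the CLI via the mlx-lm API. A minimal sketch, assuming mlx-lm is installed and there is enough free memory for the 4-bit weights; the prompt is just an example:

```python
# Load an MLX-converted model and generate text with mlx-lm.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Llama-3.2-3B-Instruct-4bit")
text = generate(
    model,
    tokenizer,
    prompt="Write a haiku about unified memory.",
    max_tokens=100,
)
print(text)
```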

32GB RAM (M1/M2/M3 Pro, M2/M3 Max base)

Beginner
Stack: ollama + llama.cpp
Format: GGUF
Quantization: Q4_K_M to Q5_K_M
Install:
  brew install ollama
  ollama pull llama3.1:8b
💡 8B models run smoothly; 13B at Q4 is possible.

GUI
Stack: lm-studio + llama.cpp
Format: GGUF
Quantization: Q5_K_M
💡 Comfortable even with 13B models.

Apple Native
Stack: mlx-community + mlx
Format: MLX
Quantization: 4-bit to 8-bit
Install:
  pip install mlx mlx-lm
💡 8B models run fastest here.

64GB RAM (M2/M3 Max, M2/M3 Ultra base)

Beginner
Stack: ollama + llama.cpp
Format: GGUF
Quantization: Q5_K_M to Q6_K
Install:
  brew install ollama
  ollama pull llama3.1:70b-instruct-q4_K_M
💡 70B at Q4 works (see the estimate below).
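A rough way to see why 70B at Q4 fits in 64GB: GGUF weight size is approximately parameters × bits-per-weight / 8. The ~4.8 bits/weight figure for Q4_K_M below is an approximation; actual file sizes vary slightly:

```python
# Back-of-envelope weight-size check (not exact).
params = 70e9          # Llama 3.1 70B parameter count
bits_per_weight = 4.8  # approximate effective rate for Q4_K_M
weights_gb = params * bits_per_weight / 8 / 1e9
print(f"~{weights_gb:.0f} GB of weights")  # ~42 GB, leaving headroom for KV cache on a 64GB Mac
```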
GUI
Stack: lm-studio + llama.cpp
Format: GGUF
Quantization: Q6_K
💡 Can handle 70B models.

Apple Native
Stack: mlx-community + mlx
Format: MLX
Quantization: 4-bit
Install:
  pip install mlx mlx-lm
  mlx_lm.generate --model mlx-community/Llama-3.1-70B-Instruct-4bit --prompt "Hello"
💡 70B at 4-bit runs fastest on a Mac via MLX.

128GB+ RAM (M2/M3 Ultra, high-memory M3 Max)

Beginner
Stack: ollama + llama.cpp
Format: GGUF
Quantization: Q8_0
💡 70B at Q8 runs easily.

GUI
Stack: lm-studio + llama.cpp
Format: GGUF
Quantization: Q8_0
💡 70B at the highest-quality quantization.

Apple Native
Stack: mlx-community + mlx
Format: MLX
Quantization: 8-bit or FP16
💡 70B at FP16 is within reach.