Pre-configured combinations of launcher + engine + format + quantization for your hardware.
NVIDIA GPU users - largest user base, clear VRAM-based recommendations
8GB VRAM
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.2:3bDownload from lmstudio.ai12GB VRAM
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.1:8b16GB VRAM
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.1:8bpip install vllm
vllm serve meta-llama/Llama-3.1-8B-Instruct24GB VRAM
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.1:70b-instruct-q4_K_Mpip install vllm
vllm serve meta-llama/Llama-3.1-70B-Instruct --tensor-parallel-size 1