Local AI tools directory: launchers, inference engines, model formats, and GPU backends for running LLMs on your own hardware. A few minimal usage sketches for common entry points follow the table.
| Name | Description | Role | Type | Exec | Languages | Score | Cold Start (ms) | Memory (MB) |
|---|---|---|---|---|---|---|---|---|
| GGUF | GPT-Generated Unified Format for efficient LLM storage | Format | format | aot | Any | — | — | — |
| safetensors | Safe and fast tensor serialization format by Hugging Face | Format | format | aot | Any | — | — | — |
| Metal | Apple's GPU framework for Apple Silicon acceleration | Backend | backend | aot | Swift, Objective-C, C++ | — | — | — |
| CUDA Runtime | NVIDIA's parallel computing platform for GPU acceleration | Backend | backend | aot | C, C++, Python | — | — | — |
| Vulkan | Cross-platform GPU API for compute and graphics | Backend | backend | aot | C, C++ | — | — | — |
| llama.cpp | LLM inference in C/C++ with minimal dependencies | Engine | engine | aot | C, C++ | C+ | 100 | 50 |
| ROCm | AMD's open-source GPU computing platform | Backend | backend | aot | C, C++, Python | — | — | — |
| llamafile | Distribute and run LLMs with a single file | Engine | engine | aot | C, C++ | C- | 500 | 100 |
| ONNX Runtime | Cross-platform, high-performance ML inference and training accelerator | Interop | engine | hybrid | Python, C++, C#, ... | C- | 500 | 300 |
| LLM (Python CLI) | Access large language models from the command line | Tool | tool | hybrid | Python | D | 500 | 100 |
| Ollama | Get up and running with large language models locally | Launcher | launcher | hybrid | Python, JavaScript, Go | D | 1000 | 500 |
| Candle | Minimalist ML framework for Rust with GPU support | Engine | engine | jit | Rust | D | 300 | 200 |
| ExLlamaV2 | Fast inference library for running LLMs locally on NVIDIA GPUs | Engine | engine | aot | Python, C++, CUDA | D | 1000 | 300 |
| MLX | Apple's array framework for machine learning on Apple Silicon | Engine | engine | jit | Python, C++, Swift | D | 500 | 200 |
| CTransformers | Python bindings for GGML models with GPU acceleration | Engine | engine | hybrid | Python, C++ | D | 800 | 200 |
| Open WebUI | User-friendly web UI for LLMs with Ollama/OpenAI support | UI | launcher | hybrid | Python, TypeScript | F | 3000 | 500 |
| Text Generation Inference | Hugging Face's production-ready LLM serving solution | Serving | engine | hybrid | Rust, Python | F | 10000 | 2000 |
| KoboldCpp | Easy-to-use AI text generation with a llama.cpp backend | UI | launcher | hybrid | C++, Python | F | 1500 | 400 |
| MLC LLM | Machine Learning Compilation for LLMs | Interop | engine | aot | Python, C++ | F | 2000 | 500 |
| LocalAI | Free, open-source OpenAI alternative with local inference | Launcher | launcher | hybrid | Go, Python | F | 3000 | 800 |
| vLLM | High-throughput LLM serving with PagedAttention | Serving | engine | jit | Python | F | 5000 | 2000 |
| GPT4All | Free-to-use, locally running, privacy-aware chatbot | UI | launcher | hybrid | C++, Python | F | 2000 | 600 |
| Jan | Open-source ChatGPT alternative that runs offline | UI | launcher | hybrid | TypeScript, Python | F | 2000 | 600 |
| LM Studio | Discover, download, and run local LLMs with a beautiful GUI | UI | launcher | hybrid | Python | F | 2000 | 800 |
| Text Generation WebUI | Gradio web UI for running Large Language Models | UI | launcher | hybrid | Python | F | 5000 | 1000 |
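Starting with the formats: the quickest way to see what is inside a checkpoint is to read its header. A minimal sketch using the `safetensors` Python package; the filename is a placeholder, and `framework="numpy"` avoids pulling in torch:

```python
from safetensors import safe_open

# Inspect a .safetensors checkpoint lazily -- tensors are only
# loaded when requested, so this stays cheap even for large files.
# "model.safetensors" is a placeholder; point it at any checkpoint.
with safe_open("model.safetensors", framework="numpy") as f:
    print(f.metadata())              # free-form header metadata (may be None)
    for name in f.keys():
        t = f.get_tensor(name)       # loads just this one tensor
        print(name, t.shape, t.dtype)
```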
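llama.cpp is also the engine behind several launchers in the table (Ollama, KoboldCpp, GPT4All, LM Studio). One way to drive it from Python is the `llama-cpp-python` bindings; a minimal sketch, assuming a GGUF model on disk (`model.gguf` is a placeholder):

```python
from llama_cpp import Llama

# Load a GGUF model through llama.cpp's Python bindings.
# model_path is a placeholder; n_ctx sets the context window.
llm = Llama(model_path="model.gguf", n_ctx=2048, verbose=False)

out = llm(
    "Q: What does the GGUF format store? A:",
    max_tokens=64,
    stop=["\n"],        # stop at the end of the answer line
)
print(out["choices"][0]["text"])
```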
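Ollama wraps the same kind of engine behind a local REST API, served on port 11434 by default. A sketch against its `/api/generate` endpoint, assuming `ollama serve` is running and the named model has been pulled (`llama3.2` here is just an example):

```python
import json
import urllib.request

# Ollama listens on localhost:11434 by default. This assumes the
# daemon is running and the model was pulled beforehand.
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps({
        "model": "llama3.2",   # example model; use any model you have pulled
        "prompt": "Explain the GGUF format in one sentence.",
        "stream": False,       # one JSON object instead of a token stream
    }).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["response"])
```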
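Finally, many of the launchers and serving engines above (LocalAI, vLLM, LM Studio, llamafile, Text Generation WebUI, and Ollama among them) expose an OpenAI-compatible `/v1` endpoint, so one client covers them all. A sketch using the `openai` package; the port and model name are assumptions to adjust for whichever server you run:

```python
from openai import OpenAI

# Point the standard OpenAI client at a local server. The port varies
# by tool (e.g. 11434 for Ollama, 1234 for LM Studio, 8000 for vLLM);
# 8000 here is an assumption. Local servers ignore the API key.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

reply = client.chat.completions.create(
    model="local-model",   # many servers match this loosely or ignore it
    messages=[{"role": "user", "content": "Say hello from a local model."}],
)
print(reply.choices[0].message.content)
```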