KoboldCpp

Easy-to-use AI text generation with llama.cpp backend

Score: 38/100 (F)

Type
Execution: hybrid
Interface: gui

About

KoboldCpp is a single-file application for running GGML- and GGUF-format models. It combines a llama.cpp backend with an easy-to-use API and web interface. Features include automatic GPU layer detection, context shifting, and story/chat modes.
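As a sketch of how the API mentioned above can be used: once a KoboldCpp instance is running locally, it exposes a KoboldAI-compatible HTTP endpoint for text generation. The port (5001) and the exact payload fields below reflect common defaults and are assumptions; check your instance's settings.

```python
import json
import urllib.request

# Assumed default address of a locally running KoboldCpp instance.
API_URL = "http://localhost:5001/api/v1/generate"


def build_payload(prompt, max_length=80, temperature=0.7):
    """Build the JSON body for a generate request (fields are assumptions)."""
    return {
        "prompt": prompt,
        "max_length": max_length,
        "temperature": temperature,
    }


def generate(prompt, **kwargs):
    """POST the prompt to the local instance and return the generated text."""
    body = json.dumps(build_payload(prompt, **kwargs)).encode("utf-8")
    req = urllib.request.Request(
        API_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    # The response is assumed to follow the KoboldAI schema:
    # {"results": [{"text": "..."}]}
    return data["results"][0]["text"]
```

Usage would be `generate("Once upon a time,")` with the server running; the same endpoint backs the bundled web interface.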

Performance

Cold Start: 1500ms
Base Memory: 400MB
Startup Overhead: 300ms

Last Verified

Date: Jan 18, 2026
Method: manual test


Languages

C++, Python

Details

Isolation: process
Maturity: stable
License: AGPL-3.0
