Compare Runtimes


Metric      safetensors  CUDA Runtime     ExLlamaV2
Score       A- (83)      B- (65)          D (47)
Type
Execution   aot          aot              aot
Interface   embedded     sdk              sdk
Cold Start  <1 ms        100 ms           1000 ms
Memory      0 MB         500 MB           300 MB
Startup     <1 ms        50 ms            200 ms
Isolation   process      hardware         process
Maturity    production   production       stable
Languages   Any          C, C++, Python   Python, C++, CUDA
License     Apache-2.0   Proprietary      MIT