A comparison of AI launchers and inference engines for local LLM deployment.
| Name | Description | Role | Type | Exec | Languages | Score | Cold Start (ms) | Memory (MB) |
|---|---|---|---|---|---|---|---|---|
| Text Generation Inference | Hugging Face's production-ready LLM serving solution | Serving | engine | hybrid | Rust, Python | F | 10000 | 2000 |
| vLLM | High-throughput LLM serving with PagedAttention | Serving | engine | jit | Python | F | 5000 | 2000 |
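Both engines expose an HTTP API once started. As a rough sketch of how each is typically launched (the model ID and ports below are placeholders, not recommendations from this comparison):

```shell
# vLLM: serve a model with its OpenAI-compatible API server
# (installable via `pip install vllm`; requires a CUDA-capable GPU for most models)
vllm serve meta-llama/Llama-3.1-8B-Instruct --port 8000

# Text Generation Inference: the official Docker image is the usual entry point
docker run --gpus all --shm-size 1g -p 8080:80 \
  -v $PWD/data:/data \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id meta-llama/Llama-3.1-8B-Instruct
```

Both servers then accept requests on their respective ports; vLLM's endpoint follows the OpenAI API schema, while TGI exposes its own `/generate` route alongside an OpenAI-compatible one in recent versions.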