[ CATEGORY ]
LLM Runners
Run open-weight language models on your own hardware. The base layer of any self-hosted AI stack.
LLM Runners CPU
llama.cpp
The C/C++ LLM inference engine that runs everywhere.
RAM
vCPU
MIT
LLM Runners CPU
LocalAI
Drop-in OpenAI-compatible API for local models.
RAM
vCPU
MIT
LLM Runners CPU
Ollama
The simplest way to run open-weight LLMs locally.
RAM
vCPU
MIT
LLM Runners
vLLM
High-throughput LLM serving for production GPU workloads.
VRAM
RAM
Apache-2.0