← LLM Runners tool · /LLAMAC
llama.cpp
The C/C++ LLM inference engine that runs everywhere.
// github
★ 114.4k
last commit · today
moderate CPU only MIT
// readme · what it is
llama.cpp is the upstream that powers Ollama, LM Studio, LocalAI and most consumer LLM apps. Run it directly when you want fine control — custom quantization, exotic hardware targets, or the slimmest possible footprint. Ships a built-in OpenAI-compatible HTTP server.
// deploy notes
Compiling from source unlocks the best perf for your CPU/GPU. Pre-built binaries available.
[ ALTERNATIVE TO ]
// links