own/metal
← LLM Runners tool · /OLLAMA

Ollama

The simplest way to run open-weight LLMs locally.

// github

★ 173.0k

last commit · today

moderate CPU only MIT

// readme · what it is

Ollama bundles model weights, configuration, and an OpenAI-compatible API into single-command installs. It's the path of least resistance for self-hosting Llama, Mistral, Qwen, Phi and Gemma models — `ollama run llama3.1` is genuinely all it takes. Runs on CPU; auto-detects and uses NVIDIA, AMD or Apple Silicon GPUs when available.

// deploy notes

Single binary install on Linux/macOS/Windows. Docker image runs unmodified. CPU-only is fine for 7B Q4 models; GPU recommended for anything bigger.