Managing multiple local LLMs can be a pain. Each model has different optimal parameters, and each client needs to be individually reconfigured when new models drop.
Llama Matchmaker is a lightweight proxy that sits between your LLM clients and servers, automatically matching each request with model- or situation-specific options. It's the single place to configure all your LLMs.
Designed with amazing projects like llama-swap, llama.cpp server, and mlx-lm in mind.
- Sits in front of your LLM backend(s) and rewrites requests using declarative YAML rules.
- Can match on methods, paths, headers, and JSON body parameters.
- Can rewrite request paths, apply defaults, update/delete fields, and even render custom JSON via Go templates.
- Hot-reloads automatically when configs change.
- Works with SSL and plain HTTP.
# Install to $GOPATH/bin
go install github.com/spicyneuron/llama-matchmaker@latest
# Grab and edit the example config
curl -L -o example.config.yml https://raw.githubusercontent.com/spicyneuron/llama-matchmaker/main/examples/example.config.yml
# Start the proxy
llama-matchmaker --config example.config.yml
# Configure your clients to point at http://localhost:8081

Start from examples/example.config.yml for an annotated, OpenAI-compatible chat setup. At a glance:
- Hierarchy: a `proxy` has ordered `routes`; each route has ordered actions (grouped under `on_request` and `on_response`). All matching routes and actions run in order. This layering lets you compose transforms (ex: Ollama → OpenAI compatibility) without duplicating effort.
- Proxies live under `proxy:` (single map or list). Each has `listen` and `target`; optional `timeout` and `ssl_cert`/`ssl_key`.
- Routes match with case-insensitive regex on method/path. `target_path` rewrites outbound paths. `on_request` processes JSON bodies; non-JSON bodies pass through untouched.
- Reuse proxies, routes, or actions with `include:`; paths resolve relative to the file that references them.
- Actions:
  - `merge` (override fields)
  - `default` (set if missing)
  - `delete` (remove keys)
  - `template` (emit JSON with helpers like `toJson`, `default`, `uuid`, `now`, `add`, `mul`, `dict`, `index`, `kindIs`)
  - `stop` (end remaining actions in the current route)
- Passing multiple `--config` files appends proxies. CLI overrides for `listen`/`target`/`timeout`/`ssl-*` only work when exactly one proxy is defined.
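
The action semantics above can be sketched in a few lines of Go. This is an illustrative model, not the project's implementation: the `action` type and the `apply` and `rewrite` functions are hypothetical names, `template` is omitted, and merge/default are assumed to be shallow (top-level keys only), which the real tool may or may not be.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// action is a hypothetical, simplified model of one on_request action.
type action struct {
	kind   string         // "merge", "default", "delete", or "stop"
	fields map[string]any // used by merge and default
	keys   []string       // used by delete
}

// apply runs the actions in order against a decoded JSON object.
// Assumption: merge and default only touch top-level keys.
func apply(body map[string]any, actions []action) {
	for _, a := range actions {
		switch a.kind {
		case "merge": // override fields unconditionally
			for k, v := range a.fields {
				body[k] = v
			}
		case "default": // set only if the key is missing
			for k, v := range a.fields {
				if _, ok := body[k]; !ok {
					body[k] = v
				}
			}
		case "delete": // remove keys
			for _, k := range a.keys {
				delete(body, k)
			}
		case "stop": // skip the remaining actions in this route
			return
		}
	}
}

// rewrite decodes raw, applies the actions, and re-encodes the result.
// Anything that is not a JSON object is returned untouched.
func rewrite(raw []byte, actions []action) []byte {
	var body map[string]any
	if err := json.Unmarshal(raw, &body); err != nil || body == nil {
		return raw
	}
	apply(body, actions)
	out, err := json.Marshal(body)
	if err != nil {
		return raw
	}
	return out
}

func main() {
	actions := []action{
		{kind: "default", fields: map[string]any{"temperature": 0.7}},
		{kind: "merge", fields: map[string]any{"top_p": 0.9}},
		{kind: "delete", keys: []string{"seed"}},
		{kind: "stop"},
		{kind: "merge", fields: map[string]any{"temperature": 1.5}}, // never runs
	}
	in := []byte(`{"model":"example-model","temperature":0.2,"seed":42}`)
	fmt.Println(string(rewrite(in, actions)))
	// {"model":"example-model","temperature":0.2,"top_p":0.9}
	fmt.Println(string(rewrite([]byte("plain text"), actions)))
	// plain text
}
```

Note how `default` leaves the client's `temperature` of 0.2 alone, while `merge` would have replaced it. Choosing between the two is how you decide whether a client's own setting or the proxy's setting wins.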
# Run
go run main.go --config examples/example.config.yml
# Test
go test ./...
# Build
go build -o bin/ .