Llama Matchmaker

Managing multiple local LLMs can be a pain. Each model has different optimal parameters, and each client needs to be individually reconfigured when new models drop.

Llama Matchmaker is a lightweight proxy that sits between your LLM clients and servers and automatically matches each request with model-specific or situation-specific options. It gives you a single place to configure all your LLMs.

Designed with amazing projects like llama-swap, llama.cpp server, and mlx-lm in mind.

What It Does

- Sits in front of your LLM backend(s) and rewrites requests using declarative YAML rules.
- Matches on methods, paths, headers, and JSON body parameters.
- Rewrites request paths, applies defaults, updates or deletes fields, and renders custom JSON via Go templates.
- Hot-reloads automatically when configs change.
- Works with SSL and plain HTTP.
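
To make the matching step concrete, here is a minimal sketch of matching a request's method and path against case-insensitive regular expressions, which is how the configuration section describes route matching. It is not the project's actual code: the Route type, NewRoute, and Matches are hypothetical names, and prefixing each pattern with (?i) is only one assumed way to get case-insensitive behavior. Header and JSON body matching are left out.

```go
package main

import (
	"fmt"
	"regexp"
)

// Route is a hypothetical, simplified matcher used only for illustration.
type Route struct {
	Method *regexp.Regexp
	Path   *regexp.Regexp
}

// NewRoute compiles both patterns with the (?i) flag so that matching
// ignores case. This is an assumption about how a proxy could do it.
func NewRoute(method, path string) (Route, error) {
	m, err := regexp.Compile("(?i)" + method)
	if err != nil {
		return Route{}, fmt.Errorf("method pattern: %w", err)
	}
	p, err := regexp.Compile("(?i)" + path)
	if err != nil {
		return Route{}, fmt.Errorf("path pattern: %w", err)
	}
	return Route{Method: m, Path: p}, nil
}

// Matches reports whether both the method and the path match the route.
func (r Route) Matches(method, path string) bool {
	return r.Method.MatchString(method) && r.Path.MatchString(path)
}

func main() {
	r, err := NewRoute("^post$", "^/v1/chat/completions$")
	if err != nil {
		panic(err)
	}
	fmt.Println(r.Matches("POST", "/v1/chat/completions")) // true
	fmt.Println(r.Matches("GET", "/v1/models"))            // false
}
```

The patterns in this sketch are anchored with ^ and $ so a route does not match longer paths by accident. Whether the real config expects anchored patterns is not stated here, so check the annotated example config.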

Quickstart

# Install to $GOPATH/bin
go install github.com/spicyneuron/llama-matchmaker@latest

# Grab and edit the example config
curl -L -o example.config.yml https://raw.githubusercontent.com/spicyneuron/llama-matchmaker/main/examples/example.config.yml

# Start the proxy
llama-matchmaker --config example.config.yml

# Configure your clients to point at http://localhost:8081

Configuration & Behavior

Start from examples/example.config.yml for an annotated, OpenAI-compatible chat setup. At a glance:

- Hierarchy: a proxy has ordered routes, and each route has ordered actions (grouped under on_request and on_response). All matching routes and actions run in order. This layering lets you compose transforms (e.g., Ollama → OpenAI compatibility) without duplicating effort.
- Proxies live under proxy: (a single map or a list). Each has listen and target, plus optional timeout and ssl_cert/ssl_key.
- Routes match with case-insensitive regex on method/path. target_path rewrites outbound paths. on_request processes JSON bodies; non-JSON bodies pass through untouched.
- Reuse proxies, routes, or actions with include:; paths resolve relative to the file that references them.
- Actions:
  - merge (override fields)
  - default (set if missing)
  - delete (remove keys)
  - template (emit JSON with helpers like toJson, default, uuid, now, add, mul, dict, index, kindIs)
  - stop (end remaining actions in the current route)
- Passing multiple --config files appends proxies. CLI overrides for listen/target/timeout/ssl-* only work when exactly one proxy is defined.
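
The action semantics listed above can be sketched in a few lines of Go. This is an illustration only, not the project's implementation: the Action struct and Apply function are hypothetical, the order of operations inside a single action is an assumption, and the template action and on_response handling are left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Action is a hypothetical, simplified stand-in for one configured action.
type Action struct {
	Merge   map[string]any // override fields
	Default map[string]any // set only if the key is missing
	Delete  []string       // remove keys
	Stop    bool           // end remaining actions in this route
}

// Apply runs the actions in order against a decoded JSON body,
// mutating the map in place.
func Apply(body map[string]any, actions []Action) {
	for _, a := range actions {
		for k, v := range a.Merge {
			body[k] = v
		}
		for k, v := range a.Default {
			if _, ok := body[k]; !ok {
				body[k] = v
			}
		}
		for _, k := range a.Delete {
			delete(body, k)
		}
		if a.Stop {
			return
		}
	}
}

func main() {
	raw := []byte(`{"model":"my-model","temperature":1.5,"user":"abc"}`)
	var body map[string]any
	if err := json.Unmarshal(raw, &body); err != nil {
		panic(err)
	}

	Apply(body, []Action{
		{Default: map[string]any{"top_p": 0.9, "temperature": 0.2}},
		{Merge: map[string]any{"stream": true}},
		{Delete: []string{"user"}, Stop: true},
		{Merge: map[string]any{"model": "never-applied"}}, // skipped after stop
	})

	out, err := json.Marshal(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
	// prints {"model":"my-model","stream":true,"temperature":1.5,"top_p":0.9}
}
```

The client-supplied temperature survives because default only fills in missing keys, while merge always overwrites. The final action never runs because the preceding one sets stop.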

Development

# Run
go run main.go --config examples/example.config.yml

# Test
go test ./...

# Build
go build -o bin/ .
