How it works
Specs in, providers chosen, code verified.
Merlin reads your specs, picks a provider, runs fledge plugins for tools, and verifies the output. Eight pieces, one agent loop. Your machine, your keys, your code.
Spec-driven development
Specs go in, correct code comes out.
Merlin reads your module specs before writing a line of code. Invariants, public API, and error cases become hard constraints in the system prompt: the same *.spec.md contracts spec-sync enforces in CI.
spec-aware planning
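As a rough illustration of how spec sections could turn into prompt constraints, the sketch below parses a spec file and renders its items as a constraint list. The heading names, the bullet layout, and the functions parse_spec and constraints_prompt are assumptions made for this example; the real *.spec.md format and Merlin's prompt assembly may look different.

```python
import re

# Hypothetical spec layout: "## Invariants", "## Public API", "## Error cases"
# headings followed by "- " bullets. The real *.spec.md format may differ.
SECTIONS = ("Invariants", "Public API", "Error cases")


def parse_spec(text: str) -> dict:
    """Collect the bullet items found under each recognized heading."""
    found = {name: [] for name in SECTIONS}
    current = None
    for line in text.splitlines():
        heading = re.match(r"^##\s+(.*\S)\s*$", line)
        if heading:
            title = heading.group(1)
            # Headings outside SECTIONS switch collection off.
            current = title if title in found else None
        elif current and line.lstrip().startswith("- "):
            found[current].append(line.lstrip()[2:].strip())
    return found


def constraints_prompt(module: str, spec: dict) -> str:
    """Render parsed spec items as a hard-constraint block for a system prompt."""
    lines = [f"Hard constraints for module {module}:"]
    for name in SECTIONS:
        for item in spec[name]:
            lines.append(f"- [{name}] {item}")
    return "\n".join(lines)


EXAMPLE_SPEC = """\
# cache
## Invariants
- Entries never outlive their TTL.
## Public API
- get(key) -> value | None
## Error cases
- Raises KeyError on an empty key.
## Notes
- Not a constraint.
"""

if __name__ == "__main__":
    print(constraints_prompt("cache", parse_spec(EXAMPLE_SPEC)))
```

Keeping the parser separate from the renderer means the same parsed structure could also feed a CI check, which is the spirit of sharing one contract between the agent and spec-sync.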
Multi-provider
31 providers, one interface.
Anthropic, OpenAI (11 SKUs incl. gpt-5 / o1 / o3 / 4o), OpenRouter (×5 vendors, one key), Groq, Together, and 11 Ollama Cloud models. Swap providers with a flag; your code and keys stay on your machine.
provider switching
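One way to picture "one interface, swap with a flag" is a small registry keyed by provider name. Everything named here (Provider, EchoProvider, REGISTRY, select_provider) is hypothetical and stands in for whatever Merlin does internally; the echo backend only marks where a real vendor call would go.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Protocol


class Provider(Protocol):
    """The single interface every backend is assumed to satisfy."""

    name: str

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoProvider:
    """Stand-in backend; a real one would call a vendor API with a local key."""

    name: str

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


# Hypothetical registry: flag value -> factory for a provider.
REGISTRY: Dict[str, Callable[[], Provider]] = {
    "anthropic": lambda: EchoProvider("anthropic"),
    "openai": lambda: EchoProvider("openai"),
    "groq": lambda: EchoProvider("groq"),
}


def select_provider(flag: str) -> Provider:
    """Resolve a provider flag, failing loudly on unknown names."""
    try:
        return REGISTRY[flag]()
    except KeyError:
        known = ", ".join(sorted(REGISTRY))
        raise ValueError(f"unknown provider {flag!r}; known: {known}") from None


def run(prompt: str, flag: str) -> str:
    """Calling code stays the same; only the flag value changes."""
    return select_provider(flag).complete(prompt)


if __name__ == "__main__":
    print(run("summarize the spec", "groq"))
```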
Plugin architecture
Every tool is a plugin you can swap.
Bundled plugins cover filesystem, code search, shell, git, spec-sync, snapshots, runtime checks, media, in-loop sub-agents, and the Discord + Telegram bridges. Write your own in any language; it's just a binary that speaks JSON-lines over the fledge-v1 protocol.
fledge.toml
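To make "a binary that speaks JSON-lines" concrete, here is a minimal sketch of a plugin loop that reads one JSON object per line on stdin and writes one per line on stdout. The message fields (id, tool, args, ok, result, error) and the word-count tool are invented for illustration; the actual fledge-v1 message schema is not described on this page and may differ.

```python
import json
import sys
from typing import Optional


def handle(request: dict) -> dict:
    """Dispatch one request. Field names here are hypothetical, not fledge-v1."""
    if request.get("tool") == "word-count":
        args = request.get("args")
        text = args.get("text", "") if isinstance(args, dict) else ""
        return {"id": request.get("id"), "ok": True, "result": len(str(text).split())}
    return {"id": request.get("id"), "ok": False, "error": "unknown tool"}


def respond(line: str) -> Optional[str]:
    """Turn one input line into one output line, or None for blank input."""
    line = line.strip()
    if not line:
        return None
    try:
        request = json.loads(line)
    except json.JSONDecodeError as exc:
        return json.dumps({"id": None, "ok": False, "error": f"bad JSON: {exc.msg}"})
    if not isinstance(request, dict):
        return json.dumps({"id": None, "ok": False, "error": "request must be an object"})
    return json.dumps(handle(request))


def serve(stdin, stdout) -> None:
    """Read requests line by line and flush each response immediately."""
    for line in stdin:
        reply = respond(line)
        if reply is not None:
            stdout.write(reply + "\n")
            stdout.flush()


if __name__ == "__main__":
    serve(sys.stdin, sys.stdout)
```

Flushing after every response matters for a line protocol: the host is waiting on the other end of the pipe, and buffered output would stall the loop.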
Sub-agents
Delegate work without filling the parent's context.
subagent-spawn hands a self-contained subtask to a child Merlin process. The child runs its own full loop and returns a compact JSON envelope; the parent's working memory stays small no matter how wide it fans out. The default tier is tool, and recursion is capped at depth 2.
subagent-spawn
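The two ideas in this section, a depth cap and a compact return envelope, can be sketched as follows. This is an illustration under assumptions: the child is modeled as a plain callable rather than a separate process, the envelope fields and SUMMARY_LIMIT are invented, and depth is assumed to count levels below the root (so children at depth 1 and 2 are allowed and depth 3 is refused).

```python
import json
from typing import Callable

MAX_DEPTH = 2        # from the text: recursion is capped at depth 2
SUMMARY_LIMIT = 200  # hypothetical size budget for what the parent keeps


class DepthExceeded(RuntimeError):
    """Raised when a spawn would go past the recursion cap."""


def spawn(task: str, parent_depth: int, run_child: Callable[[str, int], str]) -> str:
    """Run a child on a subtask and return a compact JSON envelope.

    run_child stands in for launching a child process; it receives the task
    and the child's depth and returns the child's full transcript.
    """
    child_depth = parent_depth + 1
    if child_depth > MAX_DEPTH:
        raise DepthExceeded(f"depth {child_depth} exceeds cap {MAX_DEPTH}")
    transcript = run_child(task, child_depth)
    envelope = {
        "task": task,
        "depth": child_depth,
        # Only a bounded summary reaches the parent, not the whole transcript.
        "summary": transcript[:SUMMARY_LIMIT],
        "truncated": len(transcript) > SUMMARY_LIMIT,
    }
    return json.dumps(envelope)


if __name__ == "__main__":
    print(spawn("list TODOs", 0, lambda task, depth: f"done: {task}"))
```

Because the parent only ever stores the envelope, its context grows with the number of subtasks rather than with the amount of work each child did.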
Media plugins
Agents that can see and hear.
The vision plugin sends images to a local Ollama model and returns descriptions. The voice plugin transcribes audio with Whisper and synthesizes replies. The same agent loop, with new senses; bridges save attachments where these plugins can find them.
vision + voice
Bridges
Run Merlin from Discord and Telegram.
First-class bridges let your team @mention Merlin or run slash commands from any channel. Reply chains become threaded sessions, live progress shows the active tool, and each channel keeps its own session context. Image and voice attachments route through the media plugins automatically.
bridges/discord
Fledge protocol
Open protocol. You can read every message.
Merlin is built on fledge-v1, a JSON-lines protocol for agent-tool communication. Every tool call, every response, fully inspectable. Stream the same NDJSON over stdout with --output ndjson for scripting.
protocol trace
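Since the stream is plain NDJSON, scripting against it needs nothing beyond a JSON parser. The sketch below counts tool calls per tool from a stream of lines. The event fields type and tool, and the value tool_call, are assumptions for the example; only the --output ndjson flag comes from the text above.

```python
import json
import sys
from collections import Counter
from typing import Iterable


def tool_call_counts(lines: Iterable[str]) -> Counter:
    """Count tool-call events per tool in an NDJSON stream.

    The "type" and "tool" field names are hypothetical; adjust them to the
    actual event schema.
    """
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between events
        event = json.loads(line)
        if isinstance(event, dict) and event.get("type") == "tool_call":
            counts[event.get("tool", "?")] += 1
    return counts


if __name__ == "__main__":
    # Intended use: pipe the agent's NDJSON output into this script's stdin.
    for tool, n in tool_call_counts(sys.stdin).most_common():
        print(f"{tool}\t{n}")
```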
Verification
verify pass is the rollback anchor.
Specs are the contract; the fledge lanes run verify command is the success oracle. When a verify loop exhausts its retries, Merlin rolls files back to the last green tree state. Nobody else treats a passing verify as the rollback anchor.
verify loop
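The retry-then-rollback behavior described above can be sketched with an in-memory file tree. This is a simplified model, not Merlin's implementation: the tree is a dict of path to contents, edit and verify are caller-supplied stand-ins for the agent's changes and the verify lane, and the max_retries default is an assumption.

```python
from typing import Callable, Dict, Tuple

Tree = Dict[str, str]  # path -> file contents; a stand-in for the working tree


def run_with_rollback(
    green: Tree,
    edit: Callable[[Tree, int], Tree],
    verify: Callable[[Tree], bool],
    max_retries: int = 3,  # hypothetical default
) -> Tuple[bool, Tree]:
    """Try edits until verify passes; otherwise return the last green tree.

    green is assumed to have passed verify already, so it is the rollback
    anchor. It is never mutated.
    """
    working = dict(green)
    for attempt in range(max_retries):
        # Each attempt builds on the previous one, as a fix-up loop would.
        working = edit(dict(working), attempt)
        if verify(working):
            return True, working  # this tree becomes the new anchor
    # Retries exhausted: discard every attempt and restore the anchor.
    return False, dict(green)


if __name__ == "__main__":
    anchor = {"a.py": "ok"}
    ok, tree = run_with_rollback(
        anchor,
        edit=lambda t, n: {**t, "a.py": "broken"},
        verify=lambda t: t["a.py"] == "fixed",
    )
    print(ok, tree)
```

Anchoring on the last tree that passed verify, rather than on the state before the most recent edit, means a run of failed attempts can never leave partially applied changes behind.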
Transparency
We publish our benchmarks.
26 test suites, 168 tests, including tool-augmented modes, updated with every release. The live data and per-provider breakdown stay on Merlin's own site.