Local-first voice dictation

Talk to your computer.
Words appear.

Quill captures your mic, transcribes with Whisper on-device, polishes with embedded llama.cpp when you ask for it, and pastes into whatever field you have focused. F8 is quick raw dictation. F9 is enhanced. Everything runs on your machine.

v0.1.0-alpha.18 · invite-only

Universal arm64 + x86_64 macOS .dmg, signed and notarized. Linux x86_64 .AppImage and .deb also available. Closed-alpha, proprietary.

Screenshot: Quill setup view showing microphone, accessibility, Whisper model, and embedded polish readiness checks

Local-first means local-first.

No transcript leaves the machine by default. The polish pass runs against a SHA256-pinned model the app manages itself. Any remote provider is opt-in per-config and surfaces a startup warning. The daemon never logs raw transcripts at the default tracing level. That is the contract.

Record, transcribe, polish, paste

Four stages, all on-device.

Each stage is a separate Rust crate, so raw and enhanced dictation stay predictable. Hold the hotkey, talk, release. The daemon does the rest without a single byte leaving your machine on the default path.

1. F8 / F9 down
   Daemon opens the mic via cpal. webrtc-vad trims silence and auto-stops.

2. Whisper
   On-device transcription with whisper-rs. Metal on Apple Silicon, CPU elsewhere.

3. Polish
   F9 only: embedded llama.cpp against the verified GGUF, with a raw-text fallback.

4. Paste
   Clipboard paste, or the macOS Accessibility API, into the focused field.
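
As a rough sketch of how the two hotkeys map onto these stages, the Rust below models the stage list per hotkey and the raw-text fallback for polish. The names Hotkey, Stage, stages_for, and polish_or_raw are illustrative assumptions, not Quill's actual crate API, and treating an empty polish result as a failure is also an assumption.

```rust
// Hypothetical types for illustration only; not Quill's real API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Hotkey {
    Quick,    // F8: raw dictation
    Enhanced, // F9: dictation plus polish
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    Record,
    Transcribe,
    Polish,
    Paste,
}

/// Stages that run for a given hotkey. Only the enhanced path includes Polish.
pub fn stages_for(hotkey: Hotkey) -> Vec<Stage> {
    match hotkey {
        Hotkey::Quick => vec![Stage::Record, Stage::Transcribe, Stage::Paste],
        Hotkey::Enhanced => vec![
            Stage::Record,
            Stage::Transcribe,
            Stage::Polish,
            Stage::Paste,
        ],
    }
}

/// Run a polish function and fall back to the raw transcript when it
/// fails or returns only whitespace (the second case is an assumption).
pub fn polish_or_raw<F>(raw: &str, polish: F) -> String
where
    F: FnOnce(&str) -> Result<String, String>,
{
    match polish(raw) {
        Ok(text) if !text.trim().is_empty() => text,
        _ => raw.to_string(),
    }
}
```

With this shape, a failed polish pass still yields pasteable text: polish_or_raw hands back the Whisper transcript unchanged rather than surfacing an error to the paste stage.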

First dictation, from a terminal

Drive it live over IPC

# Cheap liveness probe
$ quill ping
pong

# Live-switch a hotkey's paste strategy, no restart
$ quill set-inject-mode enhanced clipboard

# Wayland: let the compositor own the keybind
$ quill press quick
$ quill release quick

Seven polish styles

Pick how the polish pass rewrites you.

On the F9 enhanced path, the embedded model rewrites the raw transcript through one of seven prompt templates. casual is the default. no-polish is a passthrough: Whisper text, untouched. Set it in Settings or via the polish_template field.

casual (default)
Light cleanup. Conversational tone, contractions kept.
Slack, quick notes, PR comments

formal
Proper grammar, no contractions, business voice.
Email bodies, reports, customer replies

technical
Preserve technical terms, code, and command names verbatim.
Code review, bug reports, eng chat

bullets
Restructure continuous speech into a bulleted list.
Meeting notes, action items, stand-ups

concise
Shorter, fewer words, filler and hedging removed.
Commit messages, status updates

email
Format as an email body. Greeting, body, and sign-off when context fits.
Dictating email replies

no-polish
Pass the raw Whisper transcript through unchanged.
Fastest path, no LLM pass at all
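
For readers who think in types, here is a minimal sketch of how the seven style names could be parsed from a polish_template string. The style names and the casual default come from this page; the PolishStyle enum, its needs_llm helper, and the error message are hypothetical and may not match Quill's internals.

```rust
use std::str::FromStr;

/// The seven style names listed above. The enum itself is an illustrative
/// assumption, not Quill's real type.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum PolishStyle {
    #[default]
    Casual,
    Formal,
    Technical,
    Bullets,
    Concise,
    Email,
    NoPolish,
}

impl PolishStyle {
    /// no-polish is a passthrough, so it is the only style that skips the LLM.
    pub fn needs_llm(self) -> bool {
        !matches!(self, PolishStyle::NoPolish)
    }
}

impl FromStr for PolishStyle {
    type Err = String;

    /// Parse the lowercase names exactly as they appear on this page.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "casual" => Ok(PolishStyle::Casual),
            "formal" => Ok(PolishStyle::Formal),
            "technical" => Ok(PolishStyle::Technical),
            "bullets" => Ok(PolishStyle::Bullets),
            "concise" => Ok(PolishStyle::Concise),
            "email" => Ok(PolishStyle::Email),
            "no-polish" => Ok(PolishStyle::NoPolish),
            other => Err(format!("unknown polish style: {other}")),
        }
    }
}
```

Rejecting unknown names instead of silently falling back to casual is a design choice in this sketch; it keeps a typo in the config from quietly changing how text is rewritten.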

Key concepts

The core vocabulary. The daemon composes everything else from these terms.

F8 (quick)
The fast path, on by default: capture, transcribe, paste. No polish pass, lowest latency.
F9 (enhanced)
Adds the local polish pass before paste. Recommended but off until you set a binding for it.
polish style
One of seven prompt templates: casual, formal, technical, bullets, concise, email, no-polish.
inject mode
Per-hotkey: clipboard, clipboard-only, or keystroke. Live-switch with quill set-inject-mode.
polish backend
embedded by default (bundled llama.cpp). remote is the opt-in escape hatch and warns on non-loopback hosts.
daemon
The long-running process that owns the global hotkey via rdev and drives the whole pipeline over IPC.
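
To make the per-hotkey inject mode concrete, the sketch below keeps one mode per hotkey and updates it from two strings, mirroring the shape of quill set-inject-mode enhanced clipboard. Only the mode names (clipboard, clipboard-only, keystroke) and the hotkey names (quick, enhanced) come from this page. The InjectModes type, its methods, and the assumption that both hotkeys start on clipboard are hypothetical.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Hotkey {
    Quick,    // F8
    Enhanced, // F9
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InjectMode {
    Clipboard,
    ClipboardOnly,
    Keystroke,
}

fn parse_hotkey(s: &str) -> Option<Hotkey> {
    match s {
        "quick" => Some(Hotkey::Quick),
        "enhanced" => Some(Hotkey::Enhanced),
        _ => None,
    }
}

fn parse_mode(s: &str) -> Option<InjectMode> {
    match s {
        "clipboard" => Some(InjectMode::Clipboard),
        "clipboard-only" => Some(InjectMode::ClipboardOnly),
        "keystroke" => Some(InjectMode::Keystroke),
        _ => None,
    }
}

/// Hypothetical in-memory table of inject modes, one entry per hotkey.
pub struct InjectModes {
    modes: HashMap<Hotkey, InjectMode>,
}

impl InjectModes {
    /// Assumption: both hotkeys start on the clipboard paste path.
    pub fn new() -> Self {
        let mut modes = HashMap::new();
        modes.insert(Hotkey::Quick, InjectMode::Clipboard);
        modes.insert(Hotkey::Enhanced, InjectMode::Clipboard);
        InjectModes { modes }
    }

    /// Update one hotkey's mode from CLI-style string arguments.
    pub fn set(&mut self, hotkey: &str, mode: &str) -> Result<(), String> {
        let key = parse_hotkey(hotkey).ok_or_else(|| format!("unknown hotkey: {hotkey}"))?;
        let value = parse_mode(mode).ok_or_else(|| format!("unknown inject mode: {mode}"))?;
        self.modes.insert(key, value);
        Ok(())
    }

    pub fn get(&self, hotkey: Hotkey) -> InjectMode {
        *self.modes.get(&hotkey).unwrap_or(&InjectMode::Clipboard)
    }
}
```

Because the table is keyed by hotkey, switching the enhanced path to keystroke leaves the quick path untouched, which is the behavior the per-hotkey wording implies.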

The privacy contract

Dictate without trusting a cloud.

Quill is built so the private path is the default path. The embedded polish model is pinned and verified; remote is the loud exception, not the rule. Here is exactly what that buys you.

  • Audio never leaves the machine. Whisper runs on-device. There is no transcript network round-trip on the default path.
  • Telemetry and crash reports OFF. Both are opt-in and default OFF. Report-a-problem attaches only redacted log tails.
  • No raw transcript logging. The daemon refuses to log raw transcripts at the default tracing level. That is enforced, not advised.
  • SHA256-pinned polish model. The embedded GGUF (Qwen3 4B Q4_K_M) is verified against a pinned SHA256 before first use.
  • Remote is explicit and loud. Pointing polish at a remote endpoint is per-config opt-in and fires a startup warning for non-loopback hosts.
  • Proprietary, invite-only alpha. Closed-alpha binaries are gated through Discord during dogfooding, not anonymous download.
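
The non-loopback warning can be illustrated with a small check. This is a sketch under stated assumptions: the function names and the warning text are invented here, the host is assumed to arrive without a port, and hostnames other than localhost are treated as non-loopback without any DNS lookup. Quill's real check may differ.

```rust
use std::net::IpAddr;

/// True when the host is "localhost" or a loopback IP literal
/// (127.0.0.0/8 or ::1). Other hostnames are not resolved here and
/// are conservatively treated as non-loopback.
pub fn is_loopback_host(host: &str) -> bool {
    if host.eq_ignore_ascii_case("localhost") {
        return true;
    }
    match host.parse::<IpAddr>() {
        Ok(ip) => ip.is_loopback(),
        Err(_) => false,
    }
}

/// Hypothetical startup check: returns a warning only for hosts that
/// would take transcripts off the machine.
pub fn startup_warning(host: &str) -> Option<String> {
    if is_loopback_host(host) {
        None
    } else {
        Some(format!(
            "remote polish host {host} is not loopback; transcripts will leave this machine"
        ))
    }
}
```

Treating unresolved names as non-loopback errs on the loud side: a hostname that happens to resolve to 127.0.0.1 would still warn in this sketch, which costs a spurious warning rather than a missed one.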

The pinned polish model

# quill-core model registry
model = "Qwen3 4B Q4_K_M"
repo  = "Qwen/Qwen3-4B-GGUF"
file  = "Qwen3-4B-Q4_K_M.gguf"
sha256 = "7485fe6f11af..."

# Verified before first use. No Ollama server.
backend = "embedded"

Polish backends: embedded (default), system, remote. Only remote leaves the machine, and only after you opt in.
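
The comparison half of a pinned-hash check might look like the sketch below. It assumes the 32-byte SHA-256 digest of the downloaded file has already been computed elsewhere (for example by a hashing crate, which is not shown), and the check_pin name and error text are hypothetical. It only demonstrates refusing a model whose digest does not match the pin.

```rust
/// Encode bytes as lowercase hex.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}

/// Compare an already-computed SHA-256 digest against the pinned hex string.
/// Hypothetical helper: hashing the file itself is out of scope here.
pub fn check_pin(pinned_hex: &str, actual_digest: &[u8; 32]) -> Result<(), String> {
    let actual_hex = to_hex(actual_digest);
    if actual_hex.eq_ignore_ascii_case(pinned_hex.trim()) {
        Ok(())
    } else {
        Err(format!(
            "model digest mismatch: expected {pinned_hex}, got {actual_hex}"
        ))
    }
}
```

A caller would run this before loading the GGUF and treat any Err as a reason not to use the file.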

What it looks like

Paper-like sheets instead of glowing panels. The setup view runs once. The live view is where you spend your time. Settings is one keystroke away.

Screenshot: Quill live view with a list of past transcripts
Idle. Hold the hotkey to dictate. Past transcripts stack below; nothing leaves the machine.

Screenshot: Quill recording state showing a live partial transcript and an audio meter
Recording. The live transcript gets the main canvas while you talk. The meter shows the mic is hearing you.

Screenshot: Quill settings panel showing capture, polish, and output configuration
Settings. Pick your hotkeys, polish style, model, paste mode, and paper-family theme. Persists to TOML.

What you get

Local-first by default

Audio never leaves your machine. Whisper runs on-device; enhanced dictation uses embedded llama.cpp against a verified GGUF in Quill's model cache. No API keys, no transcript network round-trip. Crash reports and usage telemetry both default OFF. The daemon refuses to log raw transcripts at the default tracing level.

Whisper with Metal acceleration

Speech-to-text via whisper-rs (whisper.cpp under the hood). Metal on Apple Silicon, CPU fallback everywhere else. Curated picker covers base.en, base, small.en, small, medium.en, medium. Quill manages the downloads into ~/.cache/quill/models/.

Enhanced polish without Ollama

After Whisper transcribes, a local Qwen3 4B Q4_K_M GGUF cleans up filler words, fixes punctuation, and disambiguates homophones through embedded llama.cpp. Pick one of seven styles. No Ollama install, no local HTTP server. A custom Ollama-compatible endpoint stays an explicit opt-in escape hatch.

F8 raw, F9 enhanced

F8 is the fast path: transcribe and paste. F9 adds the local polish pass before paste, and is recommended but off until you bind it. Both use the same hold-talk-release loop, and both leave the final text on the clipboard if automatic paste fails.

Pastes into any focused field

Quill uses the clipboard paste path by default and can use macOS Accessibility for richer focused-field writes (kAXSelectedText, falling back to keystrokes when a field is not AX-writable). Works in your editor, your browser, your terminal, your chat app.

Click-to-capture hotkey picker

Open Settings, click the hotkey field, press the binding you want: bare modifiers, function keys, or full chords like Cmd+Shift+Space, with reserved-combo warnings. The daemon picks changes up live over IPC: no restart, no TOML editing.

First-run model setup

First launch walks through mic and Accessibility permissions, downloads the Whisper model and embedded polish GGUF, then verifies the GGUF before use. If something breaks later, WHAT / WHY / DO error banners with stable IDs explain what happened, why, and what to do.
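
One way to picture a WHAT / WHY / DO banner with a stable ID is as a small record rendered to text. The struct, field names, and layout below are assumptions made for illustration; the page does not describe how Quill stores or formats its banners.

```rust
/// Hypothetical shape for a WHAT / WHY / DO error banner.
pub struct ErrorBanner {
    /// Stable identifier that stays the same across releases.
    pub id: &'static str,
    /// What happened.
    pub what: &'static str,
    /// Why it happened.
    pub why: &'static str,
    /// What the user should do next.
    pub action: &'static str,
}

impl ErrorBanner {
    /// Render the banner as four lines: the ID, then WHAT, WHY, and DO.
    pub fn render(&self) -> String {
        format!(
            "[{}]\nWHAT: {}\nWHY: {}\nDO: {}",
            self.id, self.what, self.why, self.action
        )
    }
}
```

Keeping the ID separate from the human-readable text means the wording can change between releases while bug reports and support threads still refer to the same identifier.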

In-app updater

Background download, signature verify (macOS spctl), install on next launch. A 'What's new' card surfaces release highlights when the version bumps, sourced from a TOML asset baked into the binary, with no network call.

Paper-first app shell

Quill defaults to large Literata reading type, no-glow layered sheets, a live transcript-first recording layout, and AAA-checked Paper, Light, Dark, OLED, Tan, Brown, Blue, Red, Pink, Green, and Grey themes.

Stack

Pure Rust workspace, one crate per pipeline stage, plus a thin iced GUI for the app shell.

| Layer | Crate | Tool |
|---|---|---|
| Audio capture + VAD | quill-audio | cpal + webrtc-vad |
| Speech-to-text | quill-stt | whisper-rs (Metal on macOS) |
| LLM polish | quill-polish | Embedded llama.cpp + verified GGUF (BYO endpoint optional) |
| Text insertion | quill-inject | arboard clipboard · enigo keystroke · macOS Accessibility |
| Hotkey + pipeline | quill-daemon | rdev + tokio |
| CLI | quill-cli | clap |
| GUI app | quill-app | iced |

Status

v0.1.0-alpha.18 is shipping on macOS and Linux. What works today, what we are hardening, what is still planned. Listed honestly.

Working today
  • Universal arm64 + x86_64 macOS .dmg, signed, notarized, stapled
  • Linux x86_64 .AppImage and .deb packages
  • Whisper STT with Metal on macOS; curated model picker
  • Embedded llama.cpp polish with verified Qwen3 4B Q4_K_M GGUF
  • F8 quick raw dictation by default; F9 recommended for enhanced (off until you set it)
  • Seven polish styles: casual, formal, technical, bullets, concise, email, no-polish
  • Click-to-capture hotkey picker with reserved-combo warnings
  • Clipboard paste by default, with macOS Accessibility for richer fields
  • 30-second first-run tour, WHAT / WHY / DO error banners
  • In-app updater with background download + signature verify
  • Crash reporter + usage telemetry: both opt-in, default OFF
In flight
  • Polish quality tuning for the embedded Qwen path
  • Long-recording UI and clearer enhance progress feedback
  • App shell lifecycle and icon unification polish
Planned
  • Windows packaging
  • ARM64 Linux build
  • Wayland-aware injection on Linux
  • Per-app custom polish prompts

Built on the CorvidLabs spine

The same lanes, specs, and agent contracts as every other CorvidLabs project.