The risk layer

How dangerous is this change?

Agents made code cheap; the scarce resource is now trust. Augur reads a diff and your commit history and answers in one number and one word: proceed, review, or block. Deterministic, no API key, no LLM. The same gate for a human PR and an autonomous agent loop.

3 verdicts: proceed · review · block
8 signals: all from git + the filesystem
0 API keys: no LLM in the core, ever
0 third-party deps: AugurKit is Foundation-only

Install

# Homebrew (macOS)
$ brew install corvidlabs/tap/augur

# Verify
$ augur --version
augur 0.4.0

# GitHub Action, drop into any repo
- uses: CorvidLabs/augur@v0
  with:
    range: origin/main..HEAD
    threshold: block

The action installs a prebuilt binary for the runner (macOS universal or Linux x86_64): no Swift toolchain required. On Linux you can also swift build -c release from source.

Why it exists

The scarce resource is trust.

Humans can't hand-review the volume agents produce, and agents have no native sense of "I'm out of my depth here, escalate." Augur is that missing primitive: language-agnostic, CI-agnostic, no API key, no LLM. It turns the senior-engineer instinct (this part is fine, that part needs a careful look) into a deterministic artifact both humans and agents can act on.

Humans triage

Spend review attention on the risky 10% of a 40-file PR. Augur sorts files riskiest-first and names exactly which signal fired on each one.

Agents gate

augur gate exits non-zero so an agent escalates to a human instead of merging blind. Drop it in a CI step or an autonomous loop.

Deterministic & grounded

Every signal derives from git history and the filesystem. No model, no network. The reported calibration says whether a score is guessing or grounded.

The surface

What makes it trustworthy.

Six properties that hold for every run, on every repo, in every language.

Deterministic, not an LLM

No API key, no model call, no token bill. Same diff in, same verdict out, every time. Augur reads the change and your git history; it never phones home.

One number, one verdict

Every range collapses to a risk score 0-100 and a verdict: proceed, review, or block. CI and agent loops gate on the verdict; humans read the score and the riskiest-first file table.

Test-gap aware

Feed it a coverage report with --coverage and risk rises where a change touches untested lines. LCOV, Cobertura, JaCoCo, and Go coverprofiles are all parsed in AugurKit, Foundation-only.
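
As a rough illustration of what test-gap awareness involves, the sketch below reads an LCOV report and lists the changed lines that never executed. It is not AugurKit's implementation (which is Swift). The function name and the `changed` mapping are hypothetical, and the only thing it relies on is the standard LCOV `SF:`, `DA:`, and `end_of_record` records.

```python
from typing import Dict, List, Set, Tuple


def uncovered_changed_lines(lcov_text: str,
                            changed: Dict[str, Set[int]]) -> Dict[str, List[int]]:
    """Return, per file, the changed lines that LCOV reports with zero hits.

    `changed` maps a file path to the line numbers touched by the diff
    (a hypothetical input shape chosen for this sketch).
    """
    hits: Dict[Tuple[str, int], int] = {}
    current = None  # path of the source file whose records we are reading
    for raw in lcov_text.splitlines():
        line = raw.strip()
        if line.startswith("SF:"):
            current = line[3:]
        elif line.startswith("DA:") and current is not None:
            # DA:<line number>,<execution count>[,<checksum>]
            fields = line[3:].split(",")
            key = (current, int(fields[0]))
            hits[key] = hits.get(key, 0) + int(fields[1])
        elif line == "end_of_record":
            current = None

    gaps: Dict[str, List[int]] = {}
    for path, lines in changed.items():
        # Lines with no DA record (comments, blanks) are not counted as gaps.
        missed = sorted(n for n in lines if hits.get((path, n)) == 0)
        if missed:
            gaps[path] = missed
    return gaps
```

A file whose changed lines all executed, or were never instrumented, does not appear in the result.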

Calibrated to your repo

augur calibrate walks history once and caches a model to .augur/cache.json, so risky means risky for this codebase. augur check --cached reuses it for tight agent loops.

Tunable policy

An optional .augur.toml sets sensitivity rules, per-signal weights, verdict thresholds, and [exclude] globs. Absent file means built-in defaults; configuration is strictly additive.

Reports for every consumer

--json for machines, --markdown for a sticky PR comment or job summary, --sarif for GitHub code scanning. Piped and redirected output is always plain text.

How it scores

A transparent prior, sharpened by history.

Scoring has two layers. A heuristic prior with documented weights always applies, even on a brand-new repo. A history calibration then scales the incident signal by how much the repo's own revert and hotfix record backs it. Every assessment reports which band it is in, so you know whether a score is guessing or grounded.

The score, transparently

# Show every contributing signal
$ augur check -v --range main..HEAD

  verdict     [!] REVIEW
  risk        37/100
  confidence  63/100
  calibration prior-only
  (0 incidents / 7 commits)

Calibration bands

  1. prior-only: A brand-new repo, or no incident history yet. The transparent heuristic prior applies on its own. The incident signal contributes nothing.
  2. weak: Some history, not enough to lean on. The incident signal is scaled down by the calibration confidence factor.
  3. history-backed: Deep history with real reverts and hotfixes. The incident signal fires at full strength; the longer augur watches, the sharper it gets.
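
The three bands can be read as a simple scaling rule on the incident signal. The sketch below is an illustration under assumptions, not augur's code. The function name, the 0.0-1.0 value range, and the way the confidence factor is passed in are all hypothetical; only the band names and their described effects come from the list above.

```python
def calibrated_incident(raw: float, band: str, confidence_factor: float = 0.0) -> float:
    """Scale a raw incident signal (assumed to be in 0.0-1.0) by calibration band."""
    if not 0.0 <= confidence_factor <= 1.0:
        raise ValueError("confidence_factor must be between 0.0 and 1.0")
    if band == "prior-only":
        return 0.0                      # no history: the signal contributes nothing
    if band == "weak":
        return raw * confidence_factor  # some history: scaled down
    if band == "history-backed":
        return raw                      # deep history: full strength
    raise ValueError(f"unknown calibration band: {band!r}")
```

How the confidence factor itself is derived from the revert and hotfix record is not specified here, so it is left as an input.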

The blend

Eight signals, one verdict.

Each signal is a pure, deterministic function over the change surface and git history, contributing a documented weight to a transparent blend. The prior weights sum to 1.0; no opaque numbers. Tune any of them in .augur.toml.
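
A weighted blend of this kind can be sketched in a few lines. Everything numeric here is a placeholder: the page names only the incident and test-gap signals, so the other six names and all of the weights are hypothetical, as is the assumption that each signal yields a value between 0.0 and 1.0.

```python
import math
from typing import Dict

# Hypothetical weights for illustration only; they are not augur's defaults.
WEIGHTS: Dict[str, float] = {
    "incident": 0.25,
    "test_gap": 0.15,
    "signal_3": 0.10,
    "signal_4": 0.10,
    "signal_5": 0.10,
    "signal_6": 0.10,
    "signal_7": 0.10,
    "signal_8": 0.10,
}


def blend(signals: Dict[str, float], weights: Dict[str, float] = WEIGHTS) -> int:
    """Combine per-signal values (0.0-1.0) into a 0-100 risk score."""
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1.0")
    total = 0.0
    for name, weight in weights.items():
        value = signals.get(name, 0.0)  # a signal that did not fire counts as 0
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"signal {name!r} out of range: {value}")
        total += weight * value
    return max(0, min(100, round(100 * total)))
```

Because the weights sum to 1.0, a score can be traced back to its parts: each signal contributes at most its weight times 100 points.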

Report, then gate

Check reports. Gate fails the build.

augur check always exits 0: it reports, it does not gate. augur gate exits non-zero when the verdict meets or exceeds your threshold, so a CI step or agent loop escalates instead of merging blind. Pick the verdict that fails your gate in .augur.toml.

Three verdicts

  • proceed: Low risk. Safe to merge or let an agent continue without a second look.
  • review: Elevated risk. A human (or a higher-confidence reviewer) should look before it lands.
  • block: High risk. gate --threshold block exits non-zero and fails the job.
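
The "meets or exceeds your threshold" rule amounts to an ordering over the three verdicts. The sketch below shows that logic under assumptions: the score cutoffs (30 and 70) are hypothetical stand-ins for whatever thresholds are configured, and the function names are illustrative only.

```python
VERDICTS = ("proceed", "review", "block")  # ordered from least to most severe


def verdict_for(score: int, review_at: int = 30, block_at: int = 70) -> str:
    """Map a 0-100 risk score to a verdict using hypothetical cutoffs."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= block_at:
        return "block"
    if score >= review_at:
        return "review"
    return "proceed"


def gate_exit_code(verdict: str, threshold: str) -> int:
    """Return 1 when the verdict is at or above the threshold, else 0."""
    return 1 if VERDICTS.index(verdict) >= VERDICTS.index(threshold) else 0
```

With these cutoffs, a score of 37 is a review verdict: it fails a gate set to review and passes one set to block.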

In CI and on your machine

# Fails the job at or above the threshold
$ augur gate --range main..HEAD --threshold review
augur gate · review (risk 37)
$ echo $?
1

# Below the threshold: passing
$ augur gate --threshold block
$ echo $?
0
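
An autonomous loop can wrap the same command and branch on its exit status. The sketch below assumes `augur` is on the PATH and uses only the `gate`, `--range`, and `--threshold` arguments shown above. The `agent_step`, `merge`, and `escalate` names are hypothetical hooks, and the injectable `run` parameter exists only so the logic can be exercised without the binary.

```python
import subprocess
from typing import Callable, Optional, Sequence


def gate(range_spec: str, threshold: str = "review",
         run: Optional[Callable[[Sequence[str]], int]] = None) -> bool:
    """Return True when `augur gate` exits 0, i.e. the change is below the threshold."""
    argv = ["augur", "gate", "--range", range_spec, "--threshold", threshold]
    if run is None:
        # Default runner: invoke the real CLI and read its exit status.
        run = lambda args: subprocess.run(list(args)).returncode
    return run(argv) == 0


def agent_step(range_spec: str,
               merge: Callable[[], None],
               escalate: Callable[[], None],
               run: Optional[Callable[[Sequence[str]], int]] = None) -> str:
    """Merge when the gate passes; otherwise hand the change to a human."""
    if gate(range_spec, "review", run):
        merge()
        return "merged"
    escalate()
    return "escalated"
```

Treating any non-zero exit as "escalate" keeps the loop conservative: a missing binary or a crashed run stops the merge as well.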

Output

One assessment, every consumer.

The same deterministic verdict renders for humans, machines, and code scanners. --json, --markdown, and --sarif are mutually exclusive; piped and redirected output is always plain text.

augur check --json

Stable, sorted-key JSON for agents and tooling. Parse the verdict, score, confidence, and per-signal detail directly.

augur check --markdown

A GitHub-flavored report: verdict heading, confidence line, riskiest-first per-file table, and a sticky-comment marker. Drops into a PR comment or job summary.

augur check --sarif

SARIF 2.1.0 for GitHub code scanning. One result per assessed file, severity mapped from each file's verdict, annotated inline on the PR.
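
To make "one result per assessed file" concrete, here is a minimal sketch of a SARIF 2.1.0 log built from per-file verdicts. It is not augur's actual output. The verdict-to-level mapping, the `ruleId` scheme, the message text, and the input shape are all assumptions; only the SARIF field names and the `note` / `warning` / `error` levels come from the SARIF specification.

```python
from typing import Dict, List

# Assumed mapping; the page says severity follows the verdict but not how.
LEVEL_FOR_VERDICT = {"proceed": "note", "review": "warning", "block": "error"}


def to_sarif(files: List[Dict]) -> Dict:
    """Build a SARIF log with one result per file.

    Each input dict is assumed to carry "path", "verdict", and "risk" keys
    (a hypothetical shape chosen for this sketch).
    """
    results = [
        {
            "ruleId": f"augur/{f['verdict']}",  # hypothetical rule naming
            "level": LEVEL_FOR_VERDICT[f["verdict"]],
            "message": {"text": f"risk {f['risk']}/100 ({f['verdict']})"},
            "locations": [
                {"physicalLocation": {"artifactLocation": {"uri": f["path"]}}}
            ],
        }
        for f in files
    ]
    return {
        "version": "2.1.0",
        "runs": [{"tool": {"driver": {"name": "augur"}}, "results": results}],
    }
```

Passing the result to `json.dumps` yields a document with the `version`, `runs`, `tool.driver.name`, and `results` structure that SARIF consumers expect.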

augur check --coverage

Sharpen test-gap with a line-coverage report. LCOV, Cobertura, JaCoCo, or a Go coverprofile, auto-detected at the repo root when present.

Key concepts

The vocabulary; everything else composes from these.

risk score: A 0-100 number summarizing how dangerous a diff looks.
verdict: proceed / review / block, derived from the score and your thresholds.
range: The git span assessed, range-first, e.g. origin/main..HEAD.
gate: augur gate exits non-zero when the verdict is at or above the threshold, e.g. --threshold block.
signal: One of eight weighted, pure risk inputs over the diff and git history.
calibration: A cached model of the repo's own commit and incident history.

The trust loop

augur scores. attest records.

Augur says how much to trust a change. Attest writes that trust down as a signed, commit-keyed record. Pipe one into the other with --from-augur and CI gains a verifiable provenance trail.

# Score the diff, then record the verdict as provenance
$ augur check --range main..HEAD --json \
  | attest sign --commit HEAD --reviewer agent:ci --from-augur -

# Later, gate on the recorded trust
$ attest verify --commit HEAD || escalate-to-human

Where it ships

Language: Swift 6 (AugurKit + CLI)
Platforms: macOS + Linux
Install: brew install corvidlabs/tap/augur
Action: CorvidLabs/augur@v0 (composite, prebuilt binary)
Latest: v0.4.0
License: MIT

v0.4.0 is live

v0.4.0 shipped on Jun 11, 2026.

New here? The toolchain onboarding wires augur into the full pipeline in about ten minutes.