The risk layer
How dangerous is this change?
Agents made code cheap; the scarce resource is now trust.
Augur reads a diff and your commit history and answers in one number and one word: proceed, review, or block. Deterministic, no API key, no LLM. The same gate for a human PR and an autonomous agent loop.
augur · main..HEAD
verdict [!] REVIEW
risk [####### ] 37/100
confidence 63/100
calibration prior-only
files (2), riskiest first:
! 36 src/auth/token.swift
· sensitivity: auth
! 35 db/001_secrets.sql
· sensitivity: secrets
→ an agent should request human review
- 3 verdicts: proceed · review · block
- 8 signals: all from git + the filesystem
- 0 API keys: no LLM in the core, ever
- 0 third-party deps: AugurKit is Foundation-only
Install
$ brew install corvidlabs/tap/augur
# Verify
$ augur --version
augur 0.4.0
- uses: CorvidLabs/augur@v0
  with:
    range: origin/main..HEAD
    threshold: block
The action installs a prebuilt binary for the runner (macOS universal or Linux x86_64), so no Swift toolchain is required. On Linux you can also build from source with swift build -c release.
Why it exists
The scarce resource is trust.
Humans can't hand-review the volume agents produce, and agents have no native sense of "I'm out of my depth here, escalate." Augur is that missing primitive: language-agnostic, CI-agnostic, no API key, no LLM. It turns the senior-engineer instinct (this part is fine, that part needs a careful look) into a deterministic artifact both humans and agents can act on.
Humans triage
Spend review attention on the risky 10% of a 40-file PR. Augur sorts files riskiest-first and names exactly which signal fired on each one.
Agents gate
augur gate exits non-zero when the verdict meets or exceeds your threshold, so an agent escalates to a human instead of merging blind. Drop it in a CI step or an autonomous loop.
Deterministic & grounded
Every signal derives from git history and the filesystem. No model, no network. The reported calibration says whether a score is guessing or grounded.
The surface
What makes it trustworthy.
Six properties that hold for every run, on every repo, in every language.
Deterministic, not an LLM
No API key, no model call, no token bill. Same diff in, same verdict out, every time. Augur reads the change and your git history; it never phones home.
One number, one verdict
Every range collapses to a risk score 0-100 and a verdict: proceed, review, or block. CI and agent loops gate on the verdict; humans read the score and the riskiest-first file table.
Test-gap aware
Feed it a coverage report with --coverage and risk rises where a change touches untested lines. LCOV, Cobertura, JaCoCo, and Go coverprofiles are all parsed in AugurKit, Foundation-only.
Calibrated to your repo
augur calibrate walks history once and caches a model to .augur/cache.json, so risky means risky *for this codebase*. augur check --cached reuses it for tight agent loops.
Tunable policy
An optional .augur.toml sets sensitivity rules, per-signal weights, verdict thresholds, and [exclude] globs. Absent file means built-in defaults; configuration is strictly additive.
Reports for every consumer
--json for machines, --markdown for a sticky PR comment or job summary, --sarif for GitHub code scanning. Piped and redirected output is always plain text.
How it scores
A transparent prior, sharpened by history.
Scoring has two layers. A heuristic prior with documented weights always applies, even on a brand-new repo. A history calibration then scales the incident signal by how much the repo's own revert and hotfix record backs it. Every assessment reports which band it is in, so you know whether a score is guessing or grounded.
The score, transparently
$ augur check -v --range main..HEAD
verdict [!] REVIEW
risk 37/100
confidence 63/100
calibration prior-only (0 incidents / 7 commits)
Calibration bands
- prior-only: A brand-new repo, or no incident history yet. The transparent heuristic prior applies on its own. The incident signal contributes nothing.
- weak: Some history, not enough to lean on. The incident signal is scaled down by the calibration confidence factor.
- history-backed: Deep history with real reverts and hotfixes. The incident signal fires at full strength; the longer augur watches, the sharper it gets.
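The three bands can be read as a scaling rule on the incident signal. The sketch below is illustrative only, not Augur's implementation: the function name scaled_incident, the 0.0 to 1.0 signal range, and the 0.0 to 1.0 confidence_factor are assumptions made for the example.

```python
def scaled_incident(raw_incident: float, band: str, confidence_factor: float = 1.0) -> float:
    """Scale a raw incident signal by its calibration band.

    Hypothetical helper: the signal and the confidence factor are both
    assumed to lie in [0, 1]; the real ranges are not documented here.
    """
    if not 0.0 <= raw_incident <= 1.0:
        raise ValueError("raw_incident must be in [0, 1]")
    if band == "prior-only":
        # No incident history yet: the signal contributes nothing.
        return 0.0
    if band == "weak":
        # Some history: scale the signal down by the confidence factor.
        if not 0.0 <= confidence_factor <= 1.0:
            raise ValueError("confidence_factor must be in [0, 1]")
        return raw_incident * confidence_factor
    if band == "history-backed":
        # Deep history: the signal fires at full strength.
        return raw_incident
    raise ValueError(f"unknown calibration band: {band}")
```

Under these assumptions, the same raw incident value of 0.8 yields 0.0 on a prior-only repo, 0.4 on a weak one with a factor of 0.5, and 0.8 once the repo is history-backed.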
The blend
Eight signals, one verdict.
Each signal is a pure, deterministic function over the change surface and git history, contributing a documented weight to a transparent blend. The prior weights sum to 1.0; no opaque numbers. Tune any of them in .augur.toml.
- sensitivity (0.2024): Touches secrets, auth, crypto, payments, migrations, infra, CI, or dependency manifests.
- test-gap (0.1656): Code changed with no test in the changeset, or, with coverage, the uncovered fraction of changed lines. Never fires on docs.
- churn (0.1380): Hot files that change constantly are statistically fragile.
- coupling (0.1196): A file's usual co-change partner is absent from the change.
- diff-shape (0.1104): Large single-file edits are harder to review well.
- ownership (0.0920): Bus-factor (single author) or diffuse ownership (many authors).
- incident (0.0920): The file's own history of reverts and hotfixes, scaled by calibration.
- codeowners (0.0800): A changed file with no declared owner in the repo's CODEOWNERS.
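To make the blend concrete, here is a minimal sketch that uses the eight prior weights listed above. It assumes each signal produces a value between 0.0 and 1.0 and that the blend is a plain weighted sum scaled to 0-100; the function name blend and the example signal values are hypothetical, and Augur's actual arithmetic may differ.

```python
# Prior weights exactly as listed above.
PRIOR_WEIGHTS = {
    "sensitivity": 0.2024,
    "test-gap": 0.1656,
    "churn": 0.1380,
    "coupling": 0.1196,
    "diff-shape": 0.1104,
    "ownership": 0.0920,
    "incident": 0.0920,
    "codeowners": 0.0800,
}


def blend(signal_values: dict[str, float], weights: dict[str, float] = PRIOR_WEIGHTS) -> float:
    """Weighted sum of per-signal values, scaled to 0-100.

    Assumption: every signal value lies in [0, 1]; signals missing from
    signal_values are treated as 0.0 (did not fire).
    """
    total = sum(weight * signal_values.get(name, 0.0) for name, weight in weights.items())
    return round(100 * total, 2)


# The documented weights sum to 1.0 (within float tolerance).
assert abs(sum(PRIOR_WEIGHTS.values()) - 1.0) < 1e-9

# Hypothetical change: sensitivity fully fires, churn fires at half strength.
example_score = blend({"sensitivity": 1.0, "churn": 0.5})
```

Because the weights sum to 1.0, a change on which every signal fires at full strength scores 100 under this sketch, and one on which none fires scores 0.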
Report, then gate
Check reports. Gate fails the build.
augur check always exits 0: it reports, it does not gate. augur gate exits non-zero when the verdict meets or exceeds your threshold, so a CI step or agent loop escalates instead of merging blind. Pick the verdict that fails your gate in .augur.toml.
Three verdicts
- proceed: Low risk. Safe to merge or let an agent continue without a second look.
- review: Elevated risk. A human (or a higher-confidence reviewer) should look before it lands.
- block: High risk. gate --threshold block exits non-zero and fails the job.
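The "meets or exceeds" rule amounts to an ordering over the three verdicts. The sketch below models that ordering; it is not Augur's source. The score cut-offs passed to verdict_for are made-up values for illustration (the real defaults are not stated here), and the exit code 1 simply stands in for "non-zero".

```python
# Verdicts ordered from least to most severe.
VERDICTS = ("proceed", "review", "block")


def verdict_for(score: int, review_at: int, block_at: int) -> str:
    """Map a 0-100 risk score to a verdict using caller-supplied cut-offs.

    review_at and block_at are hypothetical parameters, not Augur defaults.
    """
    if score >= block_at:
        return "block"
    if score >= review_at:
        return "review"
    return "proceed"


def gate_exit_code(verdict: str, threshold: str) -> int:
    """Return 1 when the verdict meets or exceeds the threshold, else 0."""
    return 1 if VERDICTS.index(verdict) >= VERDICTS.index(threshold) else 0


# With illustrative cut-offs of 30 and 70, a score of 37 is a "review".
verdict = verdict_for(37, review_at=30, block_at=70)
fails_review_gate = gate_exit_code(verdict, "review")  # meets the threshold
fails_block_gate = gate_exit_code(verdict, "block")    # below the threshold
```

A gate set to review therefore fails on both review and block verdicts, while a gate set to block fails only on block.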
In CI and on your machine
$ augur gate --range main..HEAD --threshold review
augur gate · review (risk 37)
$ echo $?
1
# Below the threshold: passing
$ augur gate --threshold block
$ echo $?
0
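An autonomous loop can branch on the same exit status. The following is a minimal sketch that wraps the augur gate invocation shown above using Python's subprocess module; the helper names gate_command, next_step, and run_gate are hypothetical, and treating every non-zero status (or a missing binary) as "escalate" is a fail-safe choice made for this example.

```python
import subprocess


def gate_command(git_range: str, threshold: str) -> list[str]:
    """Build the argv for the gate call, using only the flags shown above."""
    return ["augur", "gate", "--range", git_range, "--threshold", threshold]


def next_step(returncode: int) -> str:
    """Zero means the gate passed; anything else hands off to a human."""
    return "continue" if returncode == 0 else "escalate"


def run_gate(git_range: str = "main..HEAD", threshold: str = "review") -> str:
    """Run the gate and decide what the agent does next."""
    try:
        result = subprocess.run(gate_command(git_range, threshold))
    except FileNotFoundError:
        # augur is not on PATH: fail safe rather than merge blind.
        return "escalate"
    return next_step(result.returncode)
```

Calling run_gate() before each merge step gives the agent a single string to branch on, and keeps the policy itself (which verdict fails the gate) in Augur rather than in the agent.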
Output
One assessment, every consumer.
The same deterministic verdict renders for humans, machines, and code scanners. --json, --markdown, and --sarif are mutually exclusive; piped and redirected output is always plain text.
augur check --json
Stable, sorted-key JSON for agents and tooling. Parse the verdict, score, confidence, and per-signal detail directly.
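A consumer of --json might read the report along these lines. Augur's JSON schema is not spelled out on this page, so the field names verdict, risk, and confidence, and the sample payload itself, are assumptions for illustration only; check the real output before relying on them.

```python
import json


def summarize(report_json: str) -> tuple[str, int, int]:
    """Extract (verdict, risk, confidence) from a JSON report.

    The key names are hypothetical; adjust them to the actual schema.
    """
    report = json.loads(report_json)
    return report["verdict"], report["risk"], report["confidence"]


# Hypothetical payload shaped after the plain-text sample above.
SAMPLE_REPORT = '{"confidence": 63, "risk": 37, "verdict": "review"}'

verdict, risk, confidence = summarize(SAMPLE_REPORT)
```

Because the output is described as stable and sorted-key, a parser like this should not need to tolerate reordered fields between runs.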
augur check --markdown
A GitHub-flavored report: verdict heading, confidence line, riskiest-first per-file table, and a sticky-comment marker. Drops into a PR comment or job summary.
augur check --sarif
SARIF 2.1.0 for GitHub code scanning. One result per assessed file, severity mapped from each file's verdict, annotated inline on the PR.
augur check --coverage
Sharpen test-gap with a line-coverage report. LCOV, Cobertura, JaCoCo, or a Go coverprofile, auto-detected at the repo root when present.
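The "uncovered fraction of changed lines" can be illustrated with an LCOV report, whose records use SF: for the source file, DA:line,hits for per-line execution counts, and end_of_record as a terminator. The sketch below is not AugurKit's parser; the function names are hypothetical, and ignoring changed lines that have no DA entry (for example comments or blank lines) is an assumption of this example.

```python
def lcov_hits(lcov_text: str) -> dict[str, dict[int, int]]:
    """Parse LCOV text into {file path: {line number: hit count}}."""
    hits: dict[str, dict[int, int]] = {}
    current = None
    for raw in lcov_text.splitlines():
        line = raw.strip()
        if line.startswith("SF:"):
            current = hits.setdefault(line[3:], {})
        elif line.startswith("DA:") and current is not None:
            fields = line[3:].split(",")
            line_no, count = int(fields[0]), int(fields[1])
            current[line_no] = current.get(line_no, 0) + count
        elif line == "end_of_record":
            current = None
    return hits


def uncovered_fraction(changed_lines: set[int], file_hits: dict[int, int]) -> float:
    """Fraction of changed, instrumented lines that were never executed."""
    instrumented = [n for n in changed_lines if n in file_hits]
    if not instrumented:
        return 0.0
    uncovered = sum(1 for n in instrumented if file_hits[n] == 0)
    return uncovered / len(instrumented)


# Made-up coverage data for one file.
SAMPLE_LCOV = """SF:src/auth/token.swift
DA:10,4
DA:11,0
DA:12,0
DA:13,2
end_of_record
"""

file_hits = lcov_hits(SAMPLE_LCOV)["src/auth/token.swift"]
# Lines 11-14 changed; line 14 has no DA entry and is ignored.
fraction = uncovered_fraction({11, 12, 13, 14}, file_hits)
```

Here two of the three instrumented changed lines were never executed, so the fraction is two thirds; a change confined to line 10 or 13 would yield 0.0.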
Key concepts
The vocabulary; everything else composes from these.
- risk score: A 0-100 number summarizing how dangerous a diff looks.
- verdict: proceed / review / block, derived from the score and your thresholds.
- range: The git span assessed, range-first, e.g. origin/main..HEAD.
- gate: augur gate --threshold block exits non-zero at or above a verdict.
- signal: One of eight weighted, pure risk inputs over the diff and git history.
- calibration: A cached model of the repo's own commit and incident history.
The trust loop
augur scores. attest records.
Augur says how much to trust a change. Attest writes that trust down as a signed, commit-keyed record. Pipe one into the other with --from-augur and CI gains a verifiable provenance trail.
$ augur check --range main..HEAD --json \
| attest sign --commit HEAD --reviewer agent:ci --from-augur -
# Later, gate on the recorded trust
$ attest verify --commit HEAD || escalate-to-human
Where it ships
- Language: Swift 6 (AugurKit + CLI)
- Platforms: macOS + Linux
- Install: brew install corvidlabs/tap/augur
- Action: CorvidLabs/augur@v0 (composite, prebuilt binary)
- Latest: v0.4.0
- License: MIT
v0.4.0 is live
v0.4.0 shipped on Jun 11, 2026.
New here? The toolchain onboarding wires augur into the full pipeline in about ten minutes.