- Python 100%
|
Some checks failed
fences / fences (push) Failing after 7s
Pass 2 confirmed all six pass-1 findings resolved. New findings, all fixed:
- Untrack the stray peeragent run log accidentally swept into
|
||
|---|---|---|
| .forgejo/workflows | ||
| .research | ||
| example | ||
| kernel | ||
| tools | ||
| .gitignore | ||
| ADOPTING.md | ||
| AGENTS.md | ||
| ard.json | ||
| CATALOGS.md | ||
| CLAUDE.md | ||
| CODEX.md | ||
| GEMINI.md | ||
| HARNESS_ADAPTERS.md | ||
| LICENSE | ||
| README.md | ||
| SPEC.md | ||
| VERSIONING.md | ||
Agentic Research Discipline (ARD)
ARD is an agent-agnostic framework for grounded, verifiable research conducted by AI agents. It codifies a research discipline — an anti-fabrication core, a multi-stage verification stack, and a per-engagement control-space navigated against a non-erodable floor — that addresses a documented inventory of agentic research-failure shapes. It is designed to be adopted by any agent system, not tied to one.
ARD structures a research engagement as a control-space — a floor of always-on checks plus selectable gates, dialed to the engagement and walked as a decision-graph — and measures it against a catalogue of documented ways agent research goes wrong. The catalogue's canonical entry, and the shape ARD fences most tightly, is anchor-and-drift fabrication (GR.1): an agent fetches a real source, uses it correctly for a verifiable detail (a figure, a version, a name), then composes the surrounding attribution (which dataset, which baseline, which version) from training-recall rather than from the source. The fetch makes it look grounded; the context drifts. Anti-fabrication is the always-on floor; the verification stack and the per-engagement controls extend outward from it.
Where to start
- To understand the framework — read SPEC.md (the architecture) alongside CATALOGS.md (the baseline catalogs you populate and extend). These two are self-contained; everything else is concrete support, not a prerequisite.
- To adopt it — ADOPTING.md is a four-tier guide with a minimum-viable quickstart. Partial adoption is first-class — stop at any tier.
- To adapt it to another harness — HARNESS_ADAPTERS.md shows the thin-adapter pattern for Codex, Claude, Gemini, and other agent systems.
- To vendor the reference surface — kernel/ is what every adopter copies as-is: the runnable lint and the artifact templates (reference implementations of the spec, assuming only its §4.2 contract). Re-point the paths and it runs.
- To see it wired up — example/ is one concrete instantiation on the Claude agent system, built around the kernel.
- To check the reasoning — .research/ traces why each commitment is shaped the way it is.
What's here
| Surface | Path | What it is |
|---|---|---|
| Overview (this file) | README.md |
the front door |
| Framework specification — the architecture | SPEC.md | what is invariant across adopters |
| Baseline catalogs — grow by extension | CATALOGS.md | the baseline inventory you populate |
| Adoption guide | ADOPTING.md | four tiers, stop at any |
| Harness adapter guide | HARNESS_ADAPTERS.md | thin adapters for Codex, Claude, Gemini, and others |
| Release manifest — pin + diff this | ard.json | version + the enumerated vendorable surface |
| Vendorable reference surface | kernel/ | discipline bundle, lint, templates, schema, catalog data, conformance |
| Worked example | example/ | one Claude-agent-system instantiation |
| Defensibility trace | .research/ | curated reasoning behind each commitment |
| License | LICENSE | MIT |
| Versioning policy | VERSIONING.md | Semantic Versioning |
License & versioning
MIT — permissive; free to use, modify, and redistribute with attribution. Versioned by Semantic Versioning: currently v0.7.0, pre-1.0, so no stability promise yet — anything may change before 1.0.
Status
v0.7.0, pre-1.0. The architecture is proposed; this repository extracts and neutralizes it into a standalone, deployment-neutral form. v0.2 grew the baseline catalogs (the WM failure-locus, per-class IP profiles); v0.3.0 added a derived consumption contract over the canonical prose — a release manifest, generated catalog data, a verbatim discipline bundle, and a conformance set — so adopters consume updates as data instead of re-deriving from prose; v0.4.0 adds two optional, additive contracts harvested from independent deployments — typed cross-references (the related: frontmatter + a twelve-predicate vocabulary) and a sixth citation-chain check, handle uniqueness; v0.4.1 hardens the reference lint's source_url liveness probe against SSRF (a PATCH — no commitment change); v0.5.0 promotes the decomposition-rationale artifact to a canonical data contract (SPEC §10.6) and adds a proactive-enrichment acquisition discipline that fences candidate-suggestion against recall-fabrication (the AQ.3 failure shape); v0.5.1 adds false-positive suppression (frontmatter, code blocks, inline code, URLs, blockquotes, attestation files) and a --stats audit mode to the reference lint (a PATCH — no commitment change); v0.6.0 is the largest inventory bump — re-engagement & correction-propagation (a refresh change-mode, the reconcile verb + a spot-check job catalog, the PR.3 non-propagated-correction fence, GR.8) and catalog/tooling growth (the source-code source class + the vendored-source acquisition pattern, reference-lint hardening + CI meta-fences, and the one migration item — the substrate_confidence fail-open deprecation); v0.7.0 adds engagement economics & acquisition realism — the AQ.4 inadequate-attempt-blocking shape + acquisition-mode-exhaustion, the forward-looking GR.9 metadata-recall-sourcing shape + metadata-tier source-binding, a model-diversity verification property (partial decorrelation of the correlated-verifier blind-spot), and the decision_relevance value-of-information yield gate (with one migration item — decision_relevance joins the always-present registration shape). See VERSIONING.md §Version history. Pre-1.0 means the catalogs and mechanism details may still move.