# Pre-Analysis Plan Template (for AI-assisted empirical work)

- **Source article:** [Building a Pre-Analysis Plan That a Coding Agent Cannot Quietly Ignore](https://tooearlytosay.com/research/methodology/pre-analysis-plan-coding-agent/)
- **Pattern:** Three-layer architecture: rule, gate, verification.

A pre-analysis plan ("prereg") for a session in which a coding agent assists with empirical work. The plan commits to a single primary specification before the data are touched, requires a written justification for every deviation, and leaves a record that a third party can read.

Fill this in before the analysis begins. Any departure from the locked primary spec goes in the deviation log, not in the code commit message.

---

## 0. Identifiers

- **Project:** `<your-project>`
- **Session date:** `<YYYY-MM-DD>`
- **Lead analyst:** `<name>`
- **Estimator class:** `<DiD | RD | IV | OLS | other>`

---

## 1. Research question (one sentence)

> `<state the single causal or descriptive question this session is set up to answer>`

---

## 2. Locked primary specification

The single estimator that runs first, before any context-aware decisions can intrude. One row, no menu.

- **Outcome variable:** `<variable name and definition>`
- **Treatment variable:** `<variable name, treatment timing if relevant>`
- **Estimator:** `<e.g., Callaway-Sant'Anna staggered DiD; Calonico-Cattaneo-Titiunik RD>`
- **Sample restrictions:** `<list every inclusion / exclusion criterion>`
- **Controls (fixed):** `<list each control; "none" is a valid answer>`
- **Standard errors:** `<clustering level and reason>`
- **Software / package version:** `<e.g., did 2.1.2 in R>`

---

## 3. Robustness ladder

The set of pre-committed variants of the primary spec. Each entry changes exactly one thing relative to the primary spec; it is one variant, not a menu of options. List in priority order.

| # | Variant | What changes | Why we want to see it |
|---|---------|--------------|------------------------|
| 1 | `<name>` | `<single change from primary>` | `<what would update if this disagrees with primary>` |
| 2 | `<name>` | `<single change from primary>` | `<...>` |
| 3 | `<name>` | `<single change from primary>` | `<...>` |

---

## 4. Falsification criterion

What pattern in the results would persuade us our primary estimate is not credible? Be specific.

> `<e.g., "Pre-trend coefficients in any of the three years before treatment are statistically distinguishable from zero at the 5 percent level under our chosen clustering.">`
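
A criterion written this specifically can be checked mechanically. The following is a minimal sketch for the example criterion above, assuming a normal approximation for the test statistic and using made-up numbers; the function names and the input layout are illustrative, not part of any published tool. It applies no multiple-testing adjustment, so state in the plan whether one is intended.

```python
import math


def two_sided_p(estimate: float, std_error: float) -> float:
    """Two-sided p-value for H0: coefficient = 0, under a normal approximation."""
    if std_error <= 0:
        raise ValueError("standard error must be positive")
    z = abs(estimate / std_error)
    return math.erfc(z / math.sqrt(2))


def pretrend_violations(pre_period: dict, alpha: float = 0.05) -> list:
    """Return the event times whose pre-trend coefficient rejects zero at alpha.

    pre_period maps an event time (e.g. -3, -2, -1) to (estimate, std_error),
    with standard errors already computed under the locked clustering choice.
    """
    return sorted(
        t for t, (est, se) in pre_period.items() if two_sided_p(est, se) < alpha
    )


# Hypothetical numbers for illustration only, not results from any study.
example = {-3: (0.02, 0.05), -2: (0.30, 0.10), -1: (-0.01, 0.04)}
violations = pretrend_violations(example)
print("criterion triggered" if violations else "criterion not triggered", violations)
```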

---

## 5. Deviation log

Any departure from the locked primary spec or robustness ladder gets entered here, in plain text, before the new code is run.

| Timestamp | What changed | Why | What we lost or gained by changing it |
|-----------|---------------|------|----------------------------------------|
| `<YYYY-MM-DD HH:MM>` | `<...>` | `<...>` | `<...>` |

If the table is empty at the end of the session, the audit trail reads cleanly. If it is not empty, every entry tells a reviewer exactly what was negotiated and why.
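
One way to make logging cheap enough that it actually happens before the new code runs is a small helper that writes the row. The sketch below is an assumption about how that could look, not the author's tooling: `log_deviation` and `format_deviation_row` are hypothetical names, and the helper finds the table by the `| Timestamp |` header used in this template.

```python
from datetime import datetime
from pathlib import Path


def format_deviation_row(what: str, why: str, tradeoff: str, when: datetime) -> str:
    """Return one Markdown table row in the deviation log's column order."""
    cells = [when.strftime("%Y-%m-%d %H:%M"), what, why, tradeoff]
    # Escape pipes and flatten newlines so free text cannot break the table.
    cleaned = [c.replace("|", "\\|").replace("\n", " ").strip() for c in cells]
    if not all(cleaned):
        raise ValueError("every column of a deviation entry must be filled in")
    return "| " + " | ".join(cleaned) + " |"


def log_deviation(prereg_path: Path, what: str, why: str, tradeoff: str, when=None) -> str:
    """Insert a new row directly after the last row of the deviation log table."""
    row = format_deviation_row(what, why, tradeoff, when or datetime.now())
    lines = prereg_path.read_text(encoding="utf-8").splitlines()
    # Locate the table by its header row (assumes the template header is unchanged).
    start = next(
        (i for i, line in enumerate(lines) if line.startswith("| Timestamp |")), None
    )
    if start is None:
        raise ValueError("deviation log table not found")
    end = start
    while end + 1 < len(lines) and lines[end + 1].startswith("|"):
        end += 1
    lines.insert(end + 1, row)
    prereg_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return row
```

A call such as `log_deviation(Path("prereg.md"), "Dropped 2009 cohort", "Outcome not recorded that year", "Lose one pre-period")` would be made before the modified specification is executed; the example values are invented.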

---

## 6. Verification artifact (attach to working paper)

At the end of the session, produce:

- [ ] A Markdown provenance appendix that includes session metadata, decision census, considered specifications, flagged prior-to-abandonment pairs, and the deviation log above.
- [ ] A `prereg.md` snapshot frozen at the start of the session.
- [ ] A diff between `prereg.md` and the final analysis script, with each deviation cross-referenced to the log entry that explains it.
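
The frozen snapshot is only useful if a reviewer can confirm it was not edited later. A minimal sketch of one approach, assuming a SHA-256 digest stored next to the copy is acceptable evidence; the file names `prereg.frozen.md` and `prereg.frozen.sha256` and both function names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def freeze_snapshot(prereg: Path, snapshot_dir: Path) -> str:
    """Copy prereg.md once at session start and record its digest."""
    snapshot_dir.mkdir(parents=True, exist_ok=True)
    frozen = snapshot_dir / "prereg.frozen.md"
    if frozen.exists():
        # Refuse to overwrite: a snapshot is taken once, before the data are touched.
        raise FileExistsError(f"{frozen} already exists")
    shutil.copyfile(prereg, frozen)
    digest = sha256_of(frozen)
    (snapshot_dir / "prereg.frozen.sha256").write_text(digest + "\n", encoding="utf-8")
    return digest


def snapshot_is_intact(snapshot_dir: Path) -> bool:
    """True if the frozen copy still matches the digest recorded at freeze time."""
    frozen = snapshot_dir / "prereg.frozen.md"
    recorded = (snapshot_dir / "prereg.frozen.sha256").read_text(encoding="utf-8").strip()
    return sha256_of(frozen) == recorded
```

Recording the digest somewhere outside the agent's reach, for example in a commit made before the session, makes the check harder to defeat; storing it beside the snapshot, as here, only guards against accidental edits.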

---

## How to use this with a coding agent

1. **Rule layer:** load a system prompt that tells the agent to treat any stated prior as a hypothesis to test, to run the locked primary spec before making any modifications, and to refuse to silently abandon a specification.
2. **Gate layer:** save this file as `prereg.md` in the project root. Tell the agent it must add an entry to the deviation log (§5) before running any spec that is neither the locked primary spec (§2) nor on the robustness ladder (§3).
3. **Verification layer:** at the end of the session, run an audit pass on the session log and produce the provenance appendix described in §6.
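
The ordering that the rule and gate layers impose can be sketched as a single check that runs before any estimation call. This is an illustrative assumption about how such a gate might be wired, not the `forking-paths` implementation; `check_gate`, `GateError`, and the spec labels are hypothetical.

```python
class GateError(RuntimeError):
    """Raised when a specification is not yet allowed to run."""


def check_gate(spec, primary, ladder, logged, primary_has_run):
    """Classify `spec` as runnable, or raise GateError.

    spec, primary: labels for the requested spec and the locked primary spec (§2)
    ladder: labels of the pre-committed robustness variants (§3)
    logged: labels that already have a deviation log entry (§5)
    primary_has_run: whether the locked primary spec has been executed
    """
    if spec == primary:
        return "primary"
    if not primary_has_run:
        raise GateError("run the locked primary spec before anything else")
    if spec in ladder:
        return "robustness"
    if spec in logged:
        return "logged deviation"
    raise GateError(f"'{spec}' is not pre-committed; add a deviation log entry first")


# Hypothetical labels for illustration.
ladder = {"alt_clustering", "drop_never_treated"}
print(check_gate("cs_did_primary", "cs_did_primary", ladder, set(), primary_has_run=False))
print(check_gate("alt_clustering", "cs_did_primary", ladder, set(), primary_has_run=True))
```

Raising an exception rather than returning a flag means an unlogged spec stops the run instead of proceeding with a warning that is easy to miss in a long session log.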

A working implementation of this pattern is published as `forking-paths` at [github.com/dphdame/forking-paths](https://github.com/dphdame/forking-paths). The repository ships a universal system prompt, method-specific prereg starters (DiD, RD, IV), and a session-log auditor.

---

*Template extracted from work published on [Too Early To Say](https://tooearlytosay.com).*
*Licensed MIT. Victoria Cholette, 2026.*
