Literature review

Victoria Cholette

AI for Applied Researchers · Step 1 of 5

Updated July 21, 2026

This step is the pattern we use before an independent reproduction or extension. It produces a frozen scoping note: the source study, its estimand and reported estimates, and the criteria the new analysis must reproduce without tuning for numerical agreement.

The problem this step solves

Before we write code for an independent reproduction, we state exactly what the source study estimated: population, outcome, sample period, estimator, and reported value. We freeze those criteria before running the new analysis. Agreement or disagreement then becomes evidence to investigate. A nearby coefficient alone does not show that the code is right, and a different coefficient is not automatically a failure.

A precise specification is what lets an agent do useful work. Here, it fixes the source study, estimand, outcome scale, population, period, estimator, and comparison criteria before the new result exists.

The Supplemental Nutrition Assistance Program Broad-Based Categorical Eligibility case illustrates the cost of ambiguity: several estimators answer related but different questions, so matching one coefficient cannot validate the reproduction by itself.

When to use this step, and when not to

This step earns its place whenever we reproduce, extend, or compare with published work. The signal is a prior design or result whose population, outcome, estimator, and sample can be recorded before code is written.

We can skip it when no prior design answers the same question, or when the task is descriptive data engineering with no source estimate to reproduce.

Decision rule: use this step only when the source population, outcome scale, period, and estimator can be stated and frozen before any reproduction output is inspected. If one of those elements is unresolved, stop at scoping rather than tune code toward a published number.
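The decision rule can be sketched as a small gate. This is a minimal illustration, not part of the original protocol: the field names, the `draft` values, and the `next_action` helper are hypothetical.

```python
# Hypothetical field names for the four elements the decision rule lists.
REQUIRED_FIELDS = ("population", "outcome_scale", "period", "estimator")
UNRESOLVED = "NOT IN PROVIDED TEXT"


def unresolved_fields(scope):
    """Return the required fields that are missing, blank, or marked unresolved."""
    missing = []
    for name in REQUIRED_FIELDS:
        raw = scope.get(name)
        value = "" if raw is None else str(raw).strip()
        if not value or value.upper() == UNRESOLVED:
            missing.append(name)
    return missing


def next_action(scope, output_inspected):
    """Freeze only if every element is stated and no reproduction output was seen."""
    if output_inspected:
        return "stop: reproduction output was inspected before the freeze"
    missing = unresolved_fields(scope)
    if missing:
        return "stop at scoping: unresolved " + ", ".join(missing)
    return "freeze the scoping note"


# Placeholder draft, not taken from any paper; the period is still blank.
draft = {
    "population": "placeholder population",
    "outcome_scale": "percent",
    "period": "",
    "estimator": "two-way fixed effects",
}
print(next_action(draft, output_inspected=False))
# prints: stop at scoping: unresolved period
```

The gate returns a reason rather than a bare boolean so the stopping point can be written into the scoping note.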

Inputs required

A precisely stated research question, population, and outcome variable.
Access to the candidate papers, ideally full text or at least abstracts with reported point estimates.
A short list of the methods the reproduction may touch, so extraction has a defined scope.
A versioned location for the frozen scoping note and a rule that changes after the freeze are logged rather than silently overwritten.
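One way to honor the last input, sketched under assumptions: the note is a plain dictionary, the freeze stores a SHA-256 digest of its canonical JSON form, and later changes are appended to a log instead of overwriting the frozen content. The function names, field names, and dates below are hypothetical. In practice a version-control commit may serve the same purpose.

```python
import hashlib
import json


def fingerprint(note):
    """Hash a canonical JSON form so any silent edit changes the digest."""
    payload = json.dumps(note, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def freeze(note, freeze_date):
    """Wrap the note with its freeze date, digest, and an empty change log."""
    return {
        "note": dict(note),
        "freeze_date": freeze_date,
        "sha256": fingerprint(note),
        "changes": [],
    }


def log_change(frozen, field, new_value, reason, changed_on):
    """Return a new record with the change appended; the frozen note is not edited."""
    entry = {
        "field": field,
        "old": frozen["note"].get(field),
        "new": new_value,
        "reason": reason,
        "date": changed_on,
    }
    return {**frozen, "changes": frozen["changes"] + [entry]}


def is_untampered(frozen):
    """True when the stored note still matches the digest taken at freeze time."""
    return fingerprint(frozen["note"]) == frozen["sha256"]


# Placeholder content and dates, for illustration only.
frozen = freeze({"estimator": "two-way fixed effects"}, "2026-01-01")
updated = log_change(frozen, "estimator", "group-time estimator",
                     "hypothetical correction after re-reading the source", "2026-01-02")
print(len(updated["changes"]), is_untampered(updated))
# prints: 1 True
```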

The AI-assisted move

We hand an artificial intelligence (AI) assistant the research question and candidate papers. We ask it to do three concrete things and nothing broader.

First, identify the source evaluation that most directly matches the research question and quote its reported estimates verbatim, with the outcome scale attached. Second, list every estimator and diagnostic the paper relies on and cite the original methods paper for each one. Third, record the values needed for comparison after the study population and estimator are independently reproduced.

In the Supplemental Nutrition Assistance Program Broad-Based Categorical Eligibility case, the assistant builds the methods-to-source map and source-value record before any reproduction is run. The project figures remain in Provenance because they are audit references, not model targets.

The assistant extracts and organizes. We verify every number against the paper, decide which design is comparable, and freeze the note. Reported values travel as audit references, not optimization goals.

Copy-paste protocol

Here is the prompt we run at the start of an independent reproduction. The point is to force verbatim extraction, frozen comparison criteria, and per-method citation.

You are helping me scope an independent reproduction in applied microeconometrics.

RESEARCH QUESTION:
"What is the effect of [POLICY] on [OUTCOME]?"

CANDIDATE PAPERS (full text or abstracts pasted below):
[PASTE PAPERS OR ABSTRACTS]

Do exactly three things. Do not summarize broadly.

1. SOURCE STUDY
   - Name the paper that most directly evaluates this question.
   - Quote its headline point estimate(s) verbatim, and state the
     outcome scale for each (e.g. "log per capita", "percentage
     points", "take-up rate").
   - If it reports more than one estimator, give each estimate
     and label which estimator produced it.

2. METHODS-TO-SOURCE MAP
   - List every estimator and every diagnostic the paper relies on.
   - For each one, cite the original methods paper it traces to
     (author, year, journal). Do not cite the applied paper as the
     source of a method it merely uses.

3. REPRODUCTION RECORD
   - List every reported value needed to compare designs after
     independently reproducing the study population and estimator
     (point estimates, decomposition weight shares, channel
     decompositions, sample sizes).
   - For each number, give the exact figure and one sentence on
     what it measures.

Constraints:
- Quote numbers exactly as reported. Do not round or infer.
- If a number or citation is not in the text I gave you, write
  "NOT IN PROVIDED TEXT". Do not supply it from memory.
- Flag any estimate where the outcome scale is ambiguous.
- Do not treat numerical proximity as proof of reproduction.

STOP after item 3. Do not draft code or choose a preferred result.

Failure check and validation

Failure condition: any quoted estimate, outcome scale, or citation cannot be found at the recorded location in the source. Drop that item, and do not freeze the note until every remaining item passes.

Pass condition: every retained estimate appears verbatim at the stated scale, every method citation points to the source actually used, and unresolved fields remain marked "NOT IN PROVIDED TEXT." Save the checked note before any reproduction output is opened.
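The verbatim part of this check can be partly mechanized. The sketch below assumes the source text has been extracted into a dictionary keyed by location. The item fields and the toy passage are invented for illustration and are not quotes from any paper. It covers estimates and scales only, so citations still need checking by hand.

```python
UNRESOLVED = "NOT IN PROVIDED TEXT"


def squash(text):
    """Collapse whitespace so line wraps in extracted text do not cause false failures."""
    return " ".join(text.split())


def failed_items(items, source_by_location):
    """Return labels of items whose quote or scale is absent at the recorded location."""
    failures = []
    for item in items:
        if item["quote"] == UNRESOLVED:
            continue  # unresolved fields stay marked; they are not failures
        passage = squash(source_by_location.get(item["location"], ""))
        quote_found = squash(item["quote"]) in passage
        scale_found = squash(item["scale"]) in passage
        if not (quote_found and scale_found):
            failures.append(item["label"])
    return failures


# Toy passage and items, invented for this example.
source_by_location = {"p. 1": "The toy estimate is 1.0\npercentage points."}
items = [
    {"label": "toy A", "quote": "1.0 percentage points",
     "scale": "percentage points", "location": "p. 1"},
    {"label": "toy B", "quote": "2.0 percent",
     "scale": "percent", "location": "p. 1"},
    {"label": "toy C", "quote": UNRESOLVED, "scale": "", "location": ""},
]
print(failed_items(items, source_by_location))
# prints: ['toy B']
```

An empty list is necessary for the pass condition but not sufficient. A string match confirms that the figure appears at that location, not that it measures what the note says it measures.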

The deliverable

The deliverable is a frozen scoping note that can be committed with the project. It lists the source study, estimand, reported estimates with outcome scales, independent reproduction criteria, methods-to-source map, source locations, and freeze date.
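As a rough sketch of how such a note might be stored, the fields listed above can be held in immutable records and serialized to JSON for the commit. The class and field names are hypothetical. The two estimates and two method links are the ones listed under Provenance. Fields this page does not state are left as "NOT IN PROVIDED TEXT", and the freeze date is a placeholder.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ReportedEstimate:
    label: str
    value: str            # kept as a string so the figure is never re-rounded
    outcome_scale: str
    source_location: str


@dataclass(frozen=True)
class ScopingNote:
    source_study: str
    estimand: str
    reported_estimates: tuple
    reproduction_criteria: tuple
    methods_to_source: tuple  # (method, original methods paper) pairs
    freeze_date: str


note = ScopingNote(
    source_study="Wang et al. (2026)",
    estimand="NOT IN PROVIDED TEXT",
    reported_estimates=(
        ReportedEstimate("two-way fixed effects", "+5.9", "percent",
                         "NOT IN PROVIDED TEXT"),
        ReportedEstimate("heterogeneity-robust", "+15.3", "percent",
                         "NOT IN PROVIDED TEXT"),
    ),
    reproduction_criteria=("population", "outcome scale", "period", "estimator"),
    methods_to_source=(
        ("staggered-adoption decomposition", "Goodman-Bacon (2021)"),
        ("group-time estimator", "Callaway and Sant'Anna (2021)"),
    ),
    freeze_date="YYYY-MM-DD",  # placeholder, not an actual freeze date
)

# Serialize for committing alongside the project.
print(json.dumps(asdict(note), indent=2, ensure_ascii=False))
```

Using `frozen=True` means an accidental assignment such as `note.estimand = "..."` raises an error instead of silently changing the record.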

Provenance from our work

This step scoped the Supplemental Nutrition Assistance Program (SNAP) Broad-Based Categorical Eligibility (BBCE) case. The source study provides four reference values.¹

The two-way fixed-effects estimate is +5.9 percent.
The heterogeneity-robust estimate is +15.3 percent.
The eligibility-expansion channel is 11.5 percent.
The forbidden-comparison weight share is about 0.28.

The methods map traces staggered-adoption decomposition to Goodman-Bacon.² It links the group-time estimator to Callaway and Sant'Anna.³ It links pre-test caution to Roth.⁴ It links serial-correlation concerns in difference-in-differences inference to Bertrand, Duflo, and Mullainathan.⁵

The Too Early To Say article later reports +5.81 percent, but numerical proximity does not establish an independent reproduction. See "When the parallel-trends test fails on one lead, what's left?"
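The distinction between proximity and reproduction can be made explicit in a comparison record. The sketch below pairs the +5.81 figure with the +5.9 two-way fixed-effects reference. That pairing is an assumption, since the text above does not say which reference value the article's number corresponds to. The `compare` helper and its criteria names are hypothetical.

```python
from decimal import Decimal

# Hypothetical names for the design elements that must be confirmed independently.
CRITERIA = ("population", "outcome_scale", "period", "estimator")


def compare(reference, reported, confirmed):
    """Report the gap; treat it as evidence only once every criterion is confirmed."""
    gap = Decimal(reported) - Decimal(reference)
    unconfirmed = [c for c in CRITERIA if not confirmed.get(c, False)]
    if unconfirmed:
        status = "numerical proximity only"
    else:
        status = "comparable design: gap is evidence to investigate"
    return {"gap": gap, "status": status, "unconfirmed": unconfirmed}


# No criterion is marked confirmed here, matching the article-only status.
result = compare("5.9", "5.81", confirmed={})
print(result["gap"], "|", result["status"])
# prints: -0.09 | numerical proximity only
```

Decimal strings are used so the gap is computed on the figures as reported, without binary floating-point noise.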

Public-material status: article only. No matching public analysis script, frozen input manifest, run record, or saved output currently derives the article result from the frozen source-study criteria.

The open staggered-adoption methods package implements the estimator family named in the source map and is open to rerun. It does not reproduce the SNAP BBCE result.

References

1. Wang, X., Valizadeh, P., Nayga, R. M., Jr., Bryant, H. L., & Fischer, B. L. (2026). Broad-based categorical eligibility policy and SNAP participation. Journal of Policy Analysis and Management, 45(1), e70063.
2. Goodman-Bacon, A. (2021). Difference-in-differences with variation in treatment timing. Journal of Econometrics, 225(2), 254-277.
3. Callaway, B., & Sant'Anna, P. H. C. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics, 225(2), 200-230.
4. Roth, J. (2022). Pretest with caution: Event-study estimates after testing for parallel trends. American Economic Review: Insights, 4(3), 305-322.
5. Bertrand, M., Duflo, E., & Mullainathan, S. (2004). How much should we trust differences-in-differences estimates? The Quarterly Journal of Economics, 119(1), 249-275.