# Literature Surveillance Prompt + Three-Tier Verification Protocol

**Source article:** [Building a Literature Surveillance System](https://tooearlytosay.com/research/methodology/literature-surveillance-skill/)
**Pattern:** Multi-source scan, then verify in three tiers.

A weekly literature scan plus a verification funnel for any citation that makes it into a manuscript. The scan consolidates SSRN, NBER, and Semantic Scholar into one pass. The verification catches the three failure modes we have seen in practice: hallucinated papers, misattributed claims, and author swaps.

---

## Part 1 — Weekly surveillance prompt

Run this as a scheduled task. Replace the placeholders with your actual seed papers, query terms, and output path.

```
Run a weekly literature surveillance pass for <your-research-area>.

Sources to query (in this order):
  1. SSRN search (browser automation): query terms = [<term 1>, <term 2>, ...]
  2. NBER weekly metadata file (CSV at nber.org/research/data):
     filter by date >= <last run date> AND title contains any of [<term 1>, <term 2>, ...]
  3. Semantic Scholar API: same query terms; filter by year = <current year>

Citation network expansion:
  - Seed papers: <DOI 1>, <DOI 2>, <DOI 3>
  - Backward: extract reference lists from each seed.
  - Forward: pull papers citing each seed via Semantic Scholar's citation API.
  - Cap expansion at two hops from each seed.

Deduplication:
  - Match on title similarity (>= 0.9 Jaccard on title tokens) OR
    on (first author last name + year + first three title words).
  - For each unique paper, list which sources surfaced it.

Output:
  - Markdown digest at <your-output-path>/digest-<YYYY-MM-DD>.md
  - One row per paper: title, authors, year, source(s), DOI if known, one-line abstract summary.
  - Sort by source-count descending (papers found in multiple sources first).

Do NOT generate citations from memory. Every entry must come from a source response above.
If a source returns nothing for a query, say so explicitly. Do not paper over gaps.
```
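The deduplication and sort rules in the prompt can be sketched in code, which is useful for spot-checking the digest the scan produces. This is a minimal sketch, not the article's implementation: the `Paper` record, the function names, and the greedy first-match merge are all assumptions made for illustration. Only the 0.9 Jaccard threshold and the (first author last name + year + first three title words) key come from the prompt.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Paper:
    # Hypothetical record shape; adapt to whatever the scan actually returns.
    title: str
    first_author_last: str
    year: int
    sources: set = field(default_factory=set)


def title_tokens(title: str) -> list[str]:
    """Lowercase the title and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", title.lower())


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the two titles' token sets."""
    ta, tb = set(title_tokens(a)), set(title_tokens(b))
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def fallback_key(p: Paper) -> tuple:
    """(first author last name, year, first three title words)."""
    return (p.first_author_last.lower(), p.year, tuple(title_tokens(p.title)[:3]))


def is_duplicate(a: Paper, b: Paper, threshold: float = 0.9) -> bool:
    """Either rule from the prompt is enough to call two records the same paper."""
    return jaccard(a.title, b.title) >= threshold or fallback_key(a) == fallback_key(b)


def deduplicate(papers: list[Paper]) -> list[Paper]:
    """Merge duplicates, keeping the first record seen and the union of sources."""
    unique: list[Paper] = []
    for p in papers:
        for u in unique:
            if is_duplicate(p, u):
                u.sources |= p.sources
                break
        else:
            # No existing record matched, so this is a new unique paper.
            unique.append(Paper(p.title, p.first_author_last, p.year, set(p.sources)))
    # Papers surfaced by more sources come first, as the digest spec asks.
    return sorted(unique, key=lambda u: len(u.sources), reverse=True)
```

The pairwise comparison is quadratic in the number of papers, which should be acceptable for a weekly digest but would need an index for large backfills.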

---

## Part 2 — Three-tier verification

Every AI-surfaced citation is unverified until checked. The three tiers form a funnel: cheap automated checks at the top, expensive manual checks at the bottom where the highest-stakes errors hide.

### Tier 1: Metadata verification (automated)

Catches hallucinated papers, phantom DOIs, and author swaps.

```
For each citation in <file>, resolve the DOI via CrossRef. Compare the
returned title, authors, year, and journal against what the citation
claims. Flag any mismatch. For citations without DOIs, search CrossRef
by title and first author.

Report format: one row per citation with columns
  [citation_id, doi_resolves, title_match, authors_match, year_match, journal_match, flag_reason]
```
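For readers who would rather script Tier 1 than prompt it, the comparison could look roughly like the sketch below. It assumes the public CrossRef REST endpoint `https://api.crossref.org/works/<DOI>` and the usual shape of its `message` record (`title`, `container-title`, `author[].family`, `issued.date-parts`). The `claimed` dictionary layout and the function names are hypothetical. Matching is exact after normalization, which is deliberately strict and will over-flag subtitles and abbreviated journal names.

```python
import json
import re
import urllib.parse
import urllib.request


def fetch_crossref(doi: str) -> dict:
    """Return the CrossRef 'message' record for a DOI (network call).

    An unresolvable DOI raises urllib.error.HTTPError; the caller should
    record that as doi_resolves = False.
    """
    url = "https://api.crossref.org/works/" + urllib.parse.quote(doi)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)["message"]


def normalize(text: str) -> str:
    """Lowercase and keep only alphanumeric tokens so punctuation is ignored."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def compare_citation(claimed: dict, record: dict) -> dict:
    """Build one Tier 1 report row from a claimed citation and a CrossRef record.

    `claimed` is an assumed layout: citation_id, title, authors (list of
    last names in citation order), year (int), journal.
    """
    title = (record.get("title") or [""])[0]
    journal = (record.get("container-title") or [""])[0]
    families = [normalize(a.get("family", "")) for a in record.get("author", [])]
    date_parts = record.get("issued", {}).get("date-parts") or [[None]]
    year = date_parts[0][0] if date_parts[0] else None

    row = {
        "citation_id": claimed["citation_id"],
        "doi_resolves": True,
        "title_match": normalize(claimed["title"]) == normalize(title),
        # Order-sensitive on purpose: a swapped author order is a mismatch.
        "authors_match": [normalize(a) for a in claimed["authors"]] == families,
        "year_match": claimed["year"] == year,
        "journal_match": normalize(claimed["journal"]) == normalize(journal),
    }
    failed = [k for k, v in row.items() if k.endswith("_match") and not v]
    row["flag_reason"] = ", ".join(failed)
    return row
```

Keeping `compare_citation` separate from the network call means the comparison logic can be tested against saved CrossRef responses without hitting the API.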

### Tier 2: Claim verification (semi-automated)

Catches misattributed numbers and findings.

```
For each citation that includes a specific numerical claim (effect
sizes, percentages, sample sizes, AUC values, p-values, dollar amounts),
fetch the abstract from PubMed or the publisher. Search for the specific
numbers or findings referenced in the surrounding text. If the claim
does not appear in the abstract, flag for manual full-text review.

Report format: one row per claim with columns
  [citation_id, claim_text, abstract_url, claim_in_abstract, flag_reason]
```

Abstracts do not contain every finding, so a missing match does not prove the claim is wrong. It flags claims that need a closer look.
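The number-matching step of Tier 2 can be approximated with a plain string check, sketched below. The regular expression, the function names, and the assumption that `claim_text` holds only the claim (not the in-text citation, whose publication year would otherwise be treated as a number to find) are all illustrative choices. Fetching the abstract is left to the caller. The match is literal, so `0.82` and `.82` count as different; a flag therefore means "look closer", in line with the caveat above.

```python
import re

# Integers or decimals, with optional thousands separators (e.g. 1,200 or 0.05).
NUMBER = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?")


def extract_numbers(text: str) -> list[str]:
    """Return numeric tokens with thousands separators removed."""
    return [m.replace(",", "") for m in NUMBER.findall(text)]


def check_claim(citation_id: str, claim_text: str, abstract: str, abstract_url: str) -> dict:
    """Build one Tier 2 report row for a numerical claim."""
    claim_numbers = extract_numbers(claim_text)
    abstract_numbers = set(extract_numbers(abstract))
    missing = [n for n in claim_numbers if n not in abstract_numbers]

    if not claim_numbers:
        found, reason = False, "no number found in claim text"
    elif missing:
        found = False
        reason = "not in abstract: " + ", ".join(missing) + "; needs full-text review"
    else:
        found, reason = True, ""

    return {
        "citation_id": citation_id,
        "claim_text": claim_text,
        "abstract_url": abstract_url,
        "claim_in_abstract": found,
        "flag_reason": reason,
    }
```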

### Tier 3: Full-text verification (manual)

For claims central to the argument that could not be verified from the abstract, read the relevant section of the actual paper. This is where mischaracterized conclusions and fabricated specifics hide. The easier it becomes to gather citations, the more important Tier 3 becomes.

Suggested workflow:

1. Pull the Tier 2 flag list.
2. For each flagged claim, fetch the full text (institutional access, ResearchGate, author website).
3. Locate the relevant section using the claim's keywords.
4. Mark the claim as VERIFIED, REVISE (specify the corrected language), or REMOVE.
5. Log the verification source (page number, section header, supplementary appendix).
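
The reading itself stays manual, but steps 4 and 5 produce a record that can be kept in a consistent shape. The sketch below is one possible layout, not a prescribed one: the `Tier3Entry` fields, the `Status` enum, and the CSV output are assumptions. The only rule it enforces is the one stated in step 4, that a REVISE decision must carry the corrected language.

```python
import csv
import io
from dataclasses import asdict, dataclass
from enum import Enum


class Status(str, Enum):
    VERIFIED = "VERIFIED"
    REVISE = "REVISE"
    REMOVE = "REMOVE"


@dataclass
class Tier3Entry:
    citation_id: str
    claim_text: str
    status: Status
    verification_source: str  # page number, section header, or appendix
    corrected_language: str = ""

    def __post_init__(self):
        # Accept plain strings such as "REVISE" and coerce them to the enum.
        self.status = Status(self.status)
        if self.status is Status.REVISE and not self.corrected_language:
            raise ValueError("REVISE entries must specify the corrected language")


def to_csv(entries: list[Tier3Entry]) -> str:
    """Serialize the verification log as CSV text with a header row."""
    fields = ["citation_id", "claim_text", "status",
              "verification_source", "corrected_language"]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields)
    writer.writeheader()
    for entry in entries:
        row = asdict(entry)
        row["status"] = entry.status.value  # write the plain label
        writer.writerow(row)
    return buf.getvalue()
```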

---

## How to use this

- Save Part 1 as `literature-surveillance.prompt.md` in `<your-project>/.skills/` or wherever scheduled prompts live in your setup.
- Save Tier 1 and Tier 2 prompts in `<your-project>/.verification/`. Run them before any manuscript draft leaves the project folder.
- Keep Tier 3 as a discipline, not a script. The full-text review is irreducibly human.

---

*Template extracted from work published on [Too Early To Say](https://tooearlytosay.com).*
*Licensed MIT. Victoria Cholette, 2026.*
