Staying current with academic literature is one of those tasks that is simple in theory and tedious in practice. Open SSRN. Search. Scroll. Open NBER. Search. Scroll. Open Google Scholar. Search. Scroll. Note anything interesting. Repeat next week. The searches take maybe 30 minutes when done properly, which means they often get abbreviated or skipped entirely when deadlines loom.
This case study walks through building a Claude Code skill called /ca-lit that automates literature surveillance across three sources: SSRN, NBER, and Google Scholar. The goal is turning a manual weekly chore into a single command.
## What We Want
The basic use case: search for recent papers on a topic across multiple academic sources, deduplicate the results, and present them in a readable format. Optionally, save to a file for later reference.
So the command might look like:
/ca-lit "Medicaid fraud"
And the output would show papers from each source, grouped and deduplicated.
## The Three Sources
Each source has different strengths and access patterns.
### SSRN (Social Science Research Network)
SSRN hosts working papers and preprints. Papers appear here before journal publication, sometimes years before. For health economics research, this is where early-stage work shows up.
The access pattern: SSRN has a search interface at ssrn.com/search. We can navigate there, enter search terms, and extract results from the page.
### NBER (National Bureau of Economic Research)
NBER publishes economics working papers, including a dedicated Health Economics program. These are typically polished drafts from established researchers, often appearing 6-18 months before journal publication.
The access pattern: NBER has an API at nber.org/api/v1/working_papers/search. We can query directly without scraping.
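Because NBER exposes a search endpoint, the query can be built as a plain URL. The sketch below assumes the endpoint accepts a `q` parameter and a `page` parameter; those names are illustrative guesses, not documented API, and only the path comes from the text above.

```python
from urllib.parse import urlencode

# Endpoint path as described in the text; the parameter names ("q",
# "page") are assumptions for illustration, not a documented contract.
NBER_SEARCH = "https://www.nber.org/api/v1/working_papers/search"

def build_nber_query(term: str, page: int = 1) -> str:
    """Build a search URL against the NBER working-paper endpoint."""
    return f"{NBER_SEARCH}?{urlencode({'q': term, 'page': page})}"
```

Fetching the URL is then a single HTTP GET, with no browser automation involved, which is why the NBER step is the fastest of the three.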
### Google Scholar
Scholar indexes published papers with a lag. It catches work that has made it through peer review and into journals.
The access pattern: Scholar is notoriously hostile to automation. It requires a visible browser window and frequently presents CAPTCHAs. Any automated access needs to be slow and careful.
## The Skill Structure
A Claude Code skill lives in ~/.claude/skills/{skill-name}/SKILL.md. The markdown file defines the trigger, purpose, and workflow.
Here is the structure for /ca-lit:
~/.claude/skills/ca-lit/
└── SKILL.md
The skill file specifies:
- Trigger: What command invokes the skill (/ca-lit)
- Purpose: One-line description
- Options: What flags modify behavior
- Workflow: Step-by-step phases
- Output: Where results go
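A minimal sketch of what that file might look like, assuming the common skill-file convention of YAML frontmatter with `name` and `description` fields (the exact frontmatter schema and section layout here are illustrative, not the author's actual file):

```markdown
---
name: ca-lit
description: Search SSRN, NBER, and Google Scholar for recent papers on a topic
---

# /ca-lit

## Options
- `--source <name>`: search one source only (nber, ssrn, scholar)
- `--since <date>`: papers since date (YYYY-MM-DD)
- `--digest`: write results to a markdown digest file
- `--save`: append results to the JSON corpus

## Workflow
1. Query the NBER API
2. Search SSRN via the browser
3. Search Google Scholar (headed mode, slow)
4. Deduplicate and display
```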
## Walking Through the Options
Let's look at each option and what it does.
### Basic Search
/ca-lit "Medicaid fraud"
This searches all three sources for papers matching "Medicaid fraud" in the title or abstract. Results appear in the console, grouped by source.
The workflow:
- Query NBER API for matching papers
- Navigate to SSRN search, enter terms, extract results
- Navigate to Google Scholar, enter terms, extract results
- Deduplicate across sources (same paper may appear in multiple places)
- Display results
### Single Source
/ca-lit --source nber "health economics"
The --source flag limits the search to one platform. Options are nber, ssrn, or scholar.
Why use this? Google Scholar is slow and finicky. If we just want to check NBER for new working papers, there is no reason to wait for Scholar to load. Also useful when Scholar is blocking automated access entirely.
### Date Filter
/ca-lit --since 2025-01-01 "county health spending"
The --since flag filters to papers published after a date. Useful for checking what has appeared since the last search rather than getting all-time results.
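The filter itself is simple once each paper carries a date. A sketch, assuming papers are dicts with an ISO-format `date` field (that schema is an illustration, not the skill's actual internal format), treating the cutoff as inclusive:

```python
from datetime import date

def filter_since(papers: list[dict], since: str) -> list[dict]:
    """Keep papers dated on or after the --since cutoff (YYYY-MM-DD).

    Assumes each paper dict has an ISO "date" field; adjust to whatever
    metadata each source actually provides.
    """
    cutoff = date.fromisoformat(since)
    return [p for p in papers if date.fromisoformat(p["date"]) >= cutoff]
```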
### Digest Output
/ca-lit --digest "public health funding"
The --digest flag writes results to a markdown file instead of just console output. The digest format includes more detail than console output: full abstracts, direct URLs, and a summary section. Useful for creating a record of what was found.
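Rendering the digest is mostly string assembly. A minimal sketch, again assuming a hypothetical paper schema (`title`, `authors`, `source`, `url`, `abstract`); the real digest format has more sections than this:

```python
def render_digest(term: str, papers: list[dict]) -> str:
    """Render a markdown digest for one search.

    The field names used here are an assumed schema for illustration.
    """
    lines = [f"# Literature digest: {term}", ""]
    for p in papers:
        lines.append(f"## {p['title']}")
        lines.append(f"{p['authors']} ({p['source']})")
        lines.append(p["url"])
        lines.append("")
        lines.append(p["abstract"])  # full abstract, unlike console output
        lines.append("")
    return "\n".join(lines)
```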
### Save to Corpus
/ca-lit --save "Medi-Cal enrollment"
The --save flag appends results to a JSON corpus file. This builds up a searchable database of papers found over time. The corpus can feed into other skills for deeper analysis later.
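The corpus append needs its own duplicate check, since the same paper may be found in multiple weekly runs. A sketch under the same assumed paper schema, keying on lowercased title plus year (the file layout is an assumption, not the skill's actual format):

```python
import json
from pathlib import Path

def save_to_corpus(papers: list[dict], path: str) -> int:
    """Append new papers to a JSON corpus file; return how many were added.

    Papers already present (same lowercased title and year) are skipped,
    so repeated weekly runs do not bloat the corpus.
    """
    corpus_path = Path(path)
    corpus = json.loads(corpus_path.read_text()) if corpus_path.exists() else []
    seen = {(p["title"].lower(), p.get("year")) for p in corpus}
    added = 0
    for p in papers:
        key = (p["title"].lower(), p.get("year"))
        if key not in seen:
            corpus.append(p)
            seen.add(key)
            added += 1
    corpus_path.write_text(json.dumps(corpus, indent=2))
    return added
```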
### Combining Options
Options combine naturally:
/ca-lit --source nber --since 2025-01-01 --digest "health economics"
This searches only NBER, only papers since January 1, and writes results to a digest file.
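The flag handling described above maps directly onto standard option parsing. A sketch using `argparse` (the skill itself parses the command text inside Claude Code rather than via a shell, so this is an analogy, not the actual implementation):

```python
import argparse

def parse_args(argv: list[str]) -> argparse.Namespace:
    """Parse /ca-lit options, mirroring the flags described in the text."""
    parser = argparse.ArgumentParser(prog="ca-lit")
    parser.add_argument("query", help="search terms")
    parser.add_argument("--source", choices=["nber", "ssrn", "scholar"],
                        help="search one source only")
    parser.add_argument("--since", metavar="YYYY-MM-DD",
                        help="papers since date")
    parser.add_argument("--digest", action="store_true",
                        help="write results to a markdown digest file")
    parser.add_argument("--save", action="store_true",
                        help="append results to the JSON corpus")
    return parser.parse_args(argv)
```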
## The Google Scholar Problem
Scholar deserves special mention because it actively fights automation. The skill handles this by:
- Headed mode: Running a visible browser window rather than headless. Scholar detects headless browsers.
- Delays: Waiting 60+ seconds between requests. Faster access triggers blocking.
- Result limits: Only pulling 20 results per search. Pagination invites detection.
- Graceful degradation: If Scholar fails, the skill continues with SSRN and NBER only.
Sometimes Scholar will present a CAPTCHA. When this happens, we solve it manually and the skill continues. This is annoying but rare if the delays are respected.
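The pacing and graceful degradation can be sketched as a small wrapper around each source's fetch step. This is an illustration of the control flow only; the actual browser driving is a separate concern, and the `_sleep` hook exists just to make the sketch testable:

```python
import time

def polite_search(fetch, term, delay=60.0, _sleep=time.sleep):
    """Run one source's fetch with a pre-request delay.

    On any failure (blocking, CAPTCHA timeout, network error), return an
    empty list so the remaining sources still produce results. The delay
    mirrors the 60+ second pacing described above.
    """
    try:
        _sleep(delay)
        return fetch(term)
    except Exception:
        return []  # graceful degradation: skip this source
```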
If Scholar consistently fails, the --source flag lets us skip it entirely:
/ca-lit --source nber "health economics"
/ca-lit --source ssrn "health economics"
NBER and SSRN together catch most working papers before they reach Scholar anyway.
## Deduplication
The same paper often appears on multiple platforms. An NBER working paper might also be posted on SSRN and indexed by Scholar. Without deduplication, results would show the same paper three times.
The skill deduplicates based on:
- Title similarity: Fuzzy matching handles minor variations ("The Effect of..." vs "Effect of...")
- Author + year: If authors and publication year match, likely the same paper
After deduplication, the output shows unique papers only, with a note about which sources had it.
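Both heuristics fit in a few lines. A sketch using the standard library's `difflib.SequenceMatcher` for the fuzzy title comparison; the paper schema and the 0.9 similarity threshold are illustrative assumptions:

```python
from difflib import SequenceMatcher

def same_paper(a: dict, b: dict, threshold: float = 0.9) -> bool:
    """Heuristic match: fuzzy title similarity, or identical authors + year."""
    title_sim = SequenceMatcher(
        None, a["title"].lower(), b["title"].lower()).ratio()
    if title_sim >= threshold:
        return True
    return a.get("authors") == b.get("authors") and a.get("year") == b.get("year")

def deduplicate(papers: list[dict]) -> list[dict]:
    """Keep the first copy of each paper, noting which sources carried it."""
    unique: list[dict] = []
    for p in papers:
        for kept in unique:
            if same_paper(kept, p):
                kept.setdefault("sources", [kept["source"]]).append(p["source"])
                break
        else:
            unique.append(p)
    return unique
```

The pairwise scan is quadratic, which is fine at the scale of a weekly search (tens of papers, not thousands).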
## Example Session
Here is what an actual invocation looks like:
User: /ca-lit "Medicaid fraud" --digest
Claude: Searching for "Medicaid fraud" across SSRN, NBER, Google Scholar...
[NBER] Querying API...
Found: 3 papers
[SSRN] Searching...
Found: 7 papers
[Google Scholar] Searching (headed mode)...
Found: 15 papers
Deduplicating... 22 unique papers
## Summary
- NBER: 3 working papers
- SSRN: 7 preprints
- Google Scholar: 12 published papers
Digest saved: ~/Projects/CAPHE/08_research/literature/weekly_digests/digest_2026-02-01.md
The digest file contains full details for each paper.
## What This Replaces
The manual workflow:
- Open SSRN, search, scroll through results, note interesting papers
- Open NBER, browse recent working papers, note interesting ones
- Open Scholar, search, scroll, note papers
- Cross-reference notes to remove duplicates
- Save somewhere for later
Time: 30-60 minutes when done thoroughly. Often skipped.
The automated workflow:
- Run /ca-lit "search terms" --digest
- Review the generated digest
Time: 5 minutes to review. Runs consistently.
The value is reliability, not sophistication. The searches happen every week regardless of deadline pressure.
## The Skill Reference
For reference, here is the complete skill definition:
Location: ~/.claude/skills/ca-lit/SKILL.md
Trigger: /ca-lit
| Option | Purpose |
|---|---|
| `--source <name>` | Search one source only (nber, ssrn, scholar) |
| `--since <date>` | Papers since date (YYYY-MM-DD) |
| `--digest` | Write results to markdown digest file |
| `--save` | Append results to JSON corpus |
## Conclusion
Building a literature surveillance skill required defining three things: what sources to search, how to access each one, and where to put results. The implementation handles the tedious parts (navigating sites, extracting metadata, deduplicating) while leaving judgment calls (which papers matter) to humans.
The command /ca-lit "search terms" now does in minutes what used to take half an hour. More importantly, it happens consistently rather than being skipped when time is short.
## Suggested Citation
Cholette, V. (2026, February 1). Building a literature surveillance skill. Too Early To Say. https://tooearlytosay.com/research/methodology/literature-surveillance-skill/