AI Methods
AI-assisted applied economics research. Each article ships with Python code and replication materials.
Build Your Own Tools: Tutorials
Articles that walk through building research tools, end-to-end. Each pairs with code in our GitHub or with live tools at Tools & Code.
Well-Executed But Not Important: Reading Importance From the Published Record
An LLM classification of 2,493 health-economics articles to operationalize importance. Calibration is 35% of publications but 18% of citations; Identification carries a +91% premium and Reframing +126%, holding topic, journal, and year constant. Pairs with the journal-topic-shares replication repo.
Cycling Through Bad Ideas Faster: A Medicaid Branding Worked Example
A two-week solo cycle through three coding rules, a controls ladder, and a behavioral-mechanism test on state Medicaid program branding, ending at a bounded null after expansion-cohort fixed effects collapse a naive +22% headline. Companion to the meta-research piece above.
Robustness checks for Medicaid DiD after the 2023 Unwinding
A three-check diagnostic protocol for state Medicaid panels, with a clean replication dataset and worked Callaway-Sant'Anna code.
Claude Code Skills Get Stale. Audit Them Quarterly.
A repeatable audit so the skills, hooks, and memory entries we wrote for older models stop quietly shaping today's numbers.
A Pre-Analysis Plan for Your Coding Agent
A three-layer architecture (rule, gate, and verification) for keeping a reasoning agent disciplined when system prompts alone are not enough.
Building a Literature Surveillance System
Combining free tools (Google Scholar, Semantic Scholar) with an AI assistant that handles the glue: citation networks, source merging, and the quiet failures.
One Context File, Zero Re-Explanations
How we set up a CLAUDE.md context file so research context survives across sessions, and we stop re-explaining the same project every time.
From Methods Paragraph to Working Pipeline
Translating a methodology section into executable code with AI assistance, step by step.
47 Scripts to 15: Cleaning a Research Codebase
Using an AI assistant to refactor and consolidate a sprawling research codebase without losing the analytical thread.
6,613 Stores, $147, Zero Lost Data
Building resilient data pipelines that handle API failures, rate limits, and edge cases without losing rows.
400 Labels to 94% Accuracy
Building and validating a grocery store classifier through iterative labeling, with the loop documented end-to-end.
EBT Verification Methodology
Cross-validating SNAP retailer data against multiple authoritative sources, so the labels we trust have a paper trail.
How to Calculate 2.7M Transit Routes for Free
Step-by-step guide to r5py, GTFS data, and multimodal accessibility analysis at zero cost.
Most Recent
Well-Executed But Not Important: Reading Importance From the Published Record
When AI thins out the technical-flaws desk-rejection pretext, editors will have to learn to say "well-executed but not important" on the record. We classify 2,493 articles across four health-economics journals to ask what "important" has actually meant.
Cycling Through Bad Ideas Faster: A Medicaid Branding Worked Example
What AI actually adds to solo research is fast iteration through ideas that turn out to be wrong, with new techniques sometimes emerging as byproducts of the failed attempts.
Claude Code Skills Get Stale. Audit Them Quarterly.
Every skill, hook, and memory entry written for an older model is a patch with an expiration date. In empirical research, the expired ones produce wrong numbers that look right and shape the policy decisions built on them.
What AI Impact Looks Like in the Slow Data
Usage telemetry sees AI adoption; slow public data sees household conditions. The same AI tooling can read both, each at the cadence it needs.
A Pre-Analysis Plan for Your Coding Agent
A three-layer architecture for keeping reasoning agents disciplined: rule, gate, and verification. Trained priors beat system prompts, so reliable behavior redirection needs architecture, not instruction.
Building a Literature Surveillance System
Free tools like Google Scholar alerts and Semantic Scholar already monitor academic literature. What an AI coding assistant adds is the glue: combining sources, following citation networks, and catching the quiet failures that make AI-gathered references dangerous.
Browse all 53 methodology articles by category below.
Medicaid Fraud Detection
What 227 Million Rows of Medicaid Data Can and Can't Tell Us
The largest Medicaid dataset in history just went public. What it contains, what's missing, and why that matters for fraud screening.
The Label Problem: Why Fraud Labels Are Harder Than They Look
Exclusion lists are the closest thing we have to fraud labels. They are further from ground truth than most analysts assume.
What Billing Patterns Actually Look Like
Comparing excluded and non-excluded providers across billing volume, coding concentration, and temporal patterns.
Can a Classifier Find What Simpler Methods Miss?
Building a supervised fraud classifier with gradient boosting, SHAP interpretation, and honest temporal validation.
AI-Assisted Research
Well-Executed But Not Important: Reading Importance From the Published Record
An LLM classification of 2,493 health-economics articles to operationalize importance. Calibration is 35% of publications but 18% of citations; Identification carries a +91% premium and Reframing +126%, holding topic, journal, and year constant.
Cycling Through Bad Ideas Faster: A Medicaid Branding Worked Example
A two-week solo cycle through three coding rules, a controls ladder, and a behavioral-mechanism test, ending at a null. What AI compresses is the calendar time of discarding bad ideas.
One Context File, Zero Re-Explanations
How CLAUDE.md files maintain research context across sessions, eliminating repetitive explanations.
From Methods Paragraph to Working Pipeline
Translating a methodology section into executable code with AI assistance.
47 Scripts to 15: Cleaning a Research Codebase
Using AI to refactor and consolidate a sprawling research codebase.
Data Collection & Validation
6,613 Stores, $147, Zero Lost Data
Building resilient data pipelines that handle API failures, rate limits, and edge cases.
400 Labels to 94% Accuracy
Building and validating a grocery store classifier through iterative labeling.
EBT Verification Methodology
Cross-validating SNAP retailer data against multiple authoritative sources.
Spatial Analysis
Frequently Asked Questions
What is AI-assisted research?
AI-assisted research uses large language models like Claude to accelerate the translation of methodological expertise into working code. The researcher provides domain knowledge, variable definitions, and methodological decisions through context files (CLAUDE.md). The AI helps implement these ideas as code, identifies edge cases, and assists with refactoring. AI assistance doesn't replace expertise; it multiplies its impact. See our article on context files in research.
How do you calculate transit accessibility for free?
We use r5py, a Python library built on Conveyal's R5 routing engine. Combined with publicly available GTFS transit feeds, it can calculate millions of multimodal routes at zero cost. Our r5py tutorial walks through the complete process with working code examples.
How do you validate data quality?
We cross-validate against multiple authoritative sources. For grocery store data, we compared USDA Food Access Atlas listings against the official SNAP retailer database, California ABC license records, and manual verification. This iterative process, documented in our grocery store classifier article, reached 94% accuracy using 400 hand-labeled examples.
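As a rough illustration of the cross-validation idea, here is a minimal Python sketch that accepts a label only when enough sources agree and routes everything else to manual verification. The source keys, label values, store IDs, and the two-source agreement threshold are all hypothetical, chosen for illustration; the actual rules are in the grocery store classifier article and may differ.

```python
from collections import Counter


def cross_validate(labels_by_source, min_agree=2):
    """Return (label, status) for one store.

    labels_by_source maps a source name to the label that source assigns,
    or None when the store does not appear in that source.
    The min_agree threshold is an illustrative assumption.
    """
    votes = Counter(v for v in labels_by_source.values() if v is not None)
    if not votes:
        return None, "needs_review"
    top_label, top_count = votes.most_common(1)[0]
    # Accept only a unique winner backed by at least min_agree sources.
    tied = [label for label, count in votes.items() if count == top_count]
    if top_count >= min_agree and len(tied) == 1:
        return top_label, "agreed"
    return None, "needs_review"


# Hypothetical records: one store where two sources agree, one where they split.
stores = {
    "store_a": {
        "food_access_atlas": "grocery",
        "snap_retailers": "grocery",
        "abc_licenses": "liquor",
    },
    "store_b": {
        "food_access_atlas": "grocery",
        "snap_retailers": None,
        "abc_licenses": "liquor",
    },
}

results = {store_id: cross_validate(labels) for store_id, labels in stores.items()}
for store_id, (label, status) in results.items():
    print(store_id, label, status)
```

Stores that come back as "needs_review" are the ones that would go to hand labeling, which keeps the manual effort focused on cases where the sources disagree.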
Can I replicate your research?
Yes. Every article links to a public GitHub repository containing all data and code needed to reproduce the analysis. Our main replication repository contains 18 research projects with complete documentation.