A well-written methodology section is almost executable code. The gap between describing a procedure and implementing it has narrowed to the point where the description itself can drive implementation.
Let's consider what makes this possible, and what still requires human judgment.
The Distance Between Description and Code
Traditional research methods sections describe procedures that someone else must translate into working code. The translation requires understanding domain knowledge, technical requirements, and implicit assumptions that the methods paragraph doesn't state explicitly.
Agent-based coding tools change this equation. If we give Claude Code a methods paragraph with sufficient specificity, it can implement the described procedure directly. The key word is "sufficient."
What "AI-Ready" Methodology Looks Like
We can examine a methods paragraph from the food security analysis to see what makes implementation straightforward:
"We calculated transit times from census tract centroids to the nearest grocery stores using r5py with OpenStreetMap pedestrian networks and GTFS transit schedules from VTA. For each of 408 census tract centroids, we computed travel time to the closest store among 4,847 validated grocery locations. Travel times represent door-to-door duration including walking to transit, waiting, riding, transferring, and walking from the final stop to the destination. We used Tuesday departure times between 9 AM and 11 AM to capture typical mid-morning shopping trips."
This paragraph contains six implementation-critical elements:
- Specific tool named: r5py (not "a routing engine")
- Data sources identified: OpenStreetMap, VTA GTFS, validated store file
- Parameters stated: 408 origins, 4,847 destinations, Tuesday 9-11 AM
- Processing logic defined: For each origin, find minimum time to any destination
- Travel components specified: Walking, waiting, riding, transferring
- Output implied: A 408-row table of minimum transit times
If we hand this paragraph to Claude Code with access to the data files, implementation is direct. The agent can read the required steps straight out of the text:
- Install r5py
- Load the OSM network and GTFS feed
- Read census tract centroids and store locations
- Configure routing for Tuesday 9-11 AM window
- Loop through 408 origins
- Find minimum time across 4,847 destinations for each origin
- Save results
The methods paragraph functions as a specification document.
Contrast: What Vague Methodology Requires
We can compare this to a less specific version:
"We assessed food access by calculating transit-based travel times to grocery stores using standard routing methods and publicly available transit data."
This describes the same analysis, but lacks implementation detail. An agent (or human programmer) must make numerous decisions:
- Which routing tool? (Google API? OSRM? r5py?)
- Which transit data? (Which agency? Current or historical?)
- Travel time from where? (Centroids? Population-weighted centers? Random points?)
- To which destinations? (All stores? Nearest store? Stores within threshold?)
- What time of day? (Peak? Off-peak? Average across day?)
Each decision changes results. Without specification, the agent makes reasonable guesses. Reasonable guesses produce code that runs but may not match research intent.
Implementation Example
Here's what happens when we give Claude Code the specific methods paragraph:
Human instruction: "Implement the transit time calculation described in the methods section. The validated store file is at data/processed/stores_validated.csv and census tract centroids are in data/census/tract_centroids.csv."
Agent process:
- Reads the methods paragraph from the research document
- Identifies required inputs: OSM network, GTFS feed, centroids, stores
- Checks for these files in the project directory
- Finds that OSM and GTFS files don't exist locally
- Asks: "I don't see the OSM network or VTA GTFS feed. Should I download them, or do you have them elsewhere?"
- (Human responds: "Download them")
- Writes script to download OSM Bay Area extract from Geofabrik
- Downloads VTA GTFS from their public feed
- Writes a transit time calculation script matching the specifications (a sketch of such a script appears below):
  - Loads r5py
  - Builds the transport network from OSM + GTFS
  - Reads 408 centroids and 4,847 stores
  - Configures routing for the Tuesday 9-11 AM window
  - Calculates the minimum transit time from each centroid
  - Saves results to data/processed/transit_times.csv
- Runs the script
- Observes output: 408 rows as expected
- Reports completion
Total elapsed time: approximately 45 minutes (most of that is network building, which r5py does once and caches).
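For concreteness, here is a minimal sketch of what the generated script might look like. It assumes r5py's TravelTimeMatrixComputer interface, local file names for the OSM extract and GTFS feed, id/lon/lat columns in the two CSVs, and a departure date that falls inside the GTFS feed's service calendar; those specifics are illustrative, not drawn from the methods paragraph itself.

```python
# Minimal sketch of the transit-time calculation described in the methods
# paragraph. File names for the OSM extract and GTFS feed, CSV column names,
# and the exact departure date are illustrative assumptions.
import datetime

import geopandas as gpd
import pandas as pd
from r5py import TransportNetwork, TravelTimeMatrixComputer, TransportMode


def load_points(csv_path: str) -> gpd.GeoDataFrame:
    """Read a CSV with 'id', 'lon', 'lat' columns into a GeoDataFrame."""
    df = pd.read_csv(csv_path)
    return gpd.GeoDataFrame(
        df, geometry=gpd.points_from_xy(df["lon"], df["lat"]), crs="EPSG:4326"
    )


# Build the multimodal network from the OSM extract and the VTA GTFS feed.
network = TransportNetwork(
    "data/osm/bay_area.osm.pbf",    # assumed local path
    ["data/gtfs/vta_gtfs.zip"],     # assumed local path
)

origins = load_points("data/census/tract_centroids.csv")     # 408 rows
stores = load_points("data/processed/stores_validated.csv")  # 4,847 rows

# Door-to-door transit + walking times, departing Tuesday between 9 and 11 AM.
computer = TravelTimeMatrixComputer(
    network,
    origins=origins,
    destinations=stores,
    departure=datetime.datetime(2025, 1, 14, 9, 0),      # a Tuesday
    departure_time_window=datetime.timedelta(hours=2),   # 9-11 AM window
    transport_modes=[TransportMode.TRANSIT, TransportMode.WALK],
)
travel_times = computer.compute_travel_times()  # from_id, to_id, travel_time (minutes)

# Minimum travel time from each tract centroid to any validated store.
nearest = (
    travel_times.groupby("from_id")["travel_time"]
    .min()
    .rename("min_transit_minutes")
    .reset_index()
)
nearest.to_csv("data/processed/transit_times.csv", index=False)
```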
What Traditional Implementation Required
Before agent-based tools, implementing this same procedure meant:
Day 1 (4 hours):
- Research transit routing options
- Discover r5py documentation
- Understand GTFS data structure
- Figure out how to download OSM extracts
- Set up Java environment for r5py
- Debug installation issues
Day 2 (3 hours):
- Write data loading code
- Debug coordinate system mismatches
- Handle GTFS file format variations
- Write routing loop
- Debug memory issues with 408 x 4,847 calculations
Day 3 (2 hours):
- Validate output against manual spot-checks
- Fix off-by-one errors in minimum calculation
- Add progress logging
- Document the code
Total: approximately 9 hours for someone familiar with Python. Longer for someone learning the tools.
The agent compresses this to 45 minutes of mostly automated work, plus 10 minutes of human time to provide the instruction and verify results.
The Division of Labor
Implementation automation does not eliminate research judgment. Three categories of decisions remain distinctly human:
1. Research Design Decisions
The methods paragraph specifies "Tuesday 9-11 AM" for departure times. This choice reflects a research question: What does mid-morning shopping access look like?
Alternative valid choices:
- Peak commute hours (6-9 AM, 4-7 PM)
- Weekend schedules
- Average across all times
- Minimum across day (best-case access)
Each choice answers a different question. The agent can implement any of them efficiently. It cannot determine which question matters for the research.
2. Interpretation of Results
The agent produces a file with 408 transit times. It can calculate summary statistics (mean: 23.4 minutes, median: 18.7 minutes, max: 67.2 minutes). It can identify the tracts with the longest times.
What it does not determine: Whether 67 minutes constitutes a meaningful barrier. Whether the 3.6x difference between fastest and slowest access is policy-relevant. Whether these times suggest that improving transit frequency would reduce food insecurity.
These interpretations require domain knowledge about:
- How long people will travel for groceries
- Whether transit times compete with car times or walking times
- What other barriers exist (cost, cultural appropriateness of stores)
- Whether the spatial pattern suggests actionable interventions
3. Validation Strategy
The agent produces 408 numbers. It does not know whether they're reasonable without human guidance.
Validation requires:
- Spot-checking specific tracts against Google Maps (do the times seem right?)
- Comparing to prior studies (are these consistent with known patterns?)
- Checking edge cases (why does tract X show 67 minutes? Is that real or an error?)
- Verifying that routing parameters match real behavior (do people actually make 2-transfer trips?)
We can ask the agent to perform specific validation tasks ("spot-check the 5 longest transit times against Google Maps"), but designing the validation strategy is human work.
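As a small illustration, a spot-check helper might look like the following sketch, which assumes the output file and column names used in the earlier script. The agent can assemble the list; a human still judges whether the times are plausible.

```python
import pandas as pd

# Assumed file and column names; adjust to the actual pipeline outputs.
times = pd.read_csv("data/processed/transit_times.csv")
centroids = pd.read_csv("data/census/tract_centroids.csv")

# Pull the five tracts with the longest computed transit times, with their
# coordinates, so each can be re-checked manually against Google Maps.
worst = (
    times.nlargest(5, "min_transit_minutes")
    .merge(centroids, left_on="from_id", right_on="id")
    [["from_id", "min_transit_minutes", "lat", "lon"]]
)
print(worst.to_string(index=False))
```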
Time Comparison
| Task | Traditional (hours) | AI-Assisted (hours) | Reduction |
|---|---|---|---|
| Tool research and setup | 4.0 | 0.2 | 95% |
| Data acquisition | 1.0 | 0.0 | 100% |
| Code implementation | 3.0 | 0.2 | 93% |
| Debugging and validation | 2.0 | 0.3 | 85% |
| Documentation | 0.5 | 0.0 | 100% |
| Total | 10.5 | 0.7 | 93% |
The reduction concentrates in setup, implementation, and documentation. Validation time decreases less dramatically because verification still requires human judgment about what constitutes "correct."
The Broader Principle
The transit time example generalizes. If we write methods sections with implementation-level specificity:
Data collection:
"We collected store locations using Google Places API, querying for 'grocery store' within a 50km radius of each county centroid, retrieving name, address, coordinates, rating, review count, and place types for each result."
Classification:
"We trained an XGBoost classifier on 400 manually labeled locations (200 grocery stores, 200 non-grocery). Features included: Google place types (binary indicators), name substring matches (market, foods, liquor, gas, 7-Eleven), log review count, and rating."
Index construction:
"We constructed vulnerability scores as: 0.3x(poverty rate) + 0.3x(SNAP rate) + 0.2x(vehicle access) + 0.2x(median transit time), min-max normalizing each component to [0,1] before weighting."
Each description contains enough detail that implementation becomes straightforward. The agent knows what API to use, what parameters to query, what features to extract, what algorithm to train, how to weight components.
What This Changes About Research Workflow
When implementation costs approach zero, research iteration patterns change.
Traditional workflow:
1. Design analysis
2. Implement (days to weeks)
3. Run analysis
4. Discover a limitation or alternative approach
5. Decide whether re-implementation is worth the time cost
6. Often: stick with the initial approach because re-coding is expensive
Agent-assisted workflow:
1. Design analysis
2. Implement (minutes to hours)
3. Run analysis
4. Discover a limitation or alternative approach
5. Re-implement immediately
6. Compare approaches
7. Iterate until satisfied
The key difference: Step 5. When re-implementation is cheap, we can actually test the alternative approaches we think of. Robustness checks stop being theoretical ("we could try X") and become practical ("let's run it both ways").
What Still Requires Human Expertise
Three categories of work remain irreducibly human:
- Deciding what to measure: Transit times to nearest grocery store vs. transit times to high-quality grocery store vs. number of stores within 30 minutes. Each captures something different. The agent cannot determine which matters.
- Interpreting what results mean: A 3.6x difference in transit times is a number. Whether it represents a meaningful barrier to food access requires understanding of shopping behavior, household constraints, and existing literature.
- Designing the research question: The entire food security analysis starts from asking whether geographic access or economic access drives food insecurity more strongly. The agent cannot formulate this question. It can help answer it once asked.
Summary
A well-specified methods section becomes a direct implementation guide for agent-based coding tools. The closer we get to implementation-level detail in research documentation, the more smoothly agents can translate description to code.
This creates a positive feedback loop: Writing precise methods sections makes implementation faster. Fast implementation makes testing alternative approaches practical. Testing alternatives improves research quality. Higher quality research requires clearer documentation.
The limiting factor shifts from "how long will this take to code?" to "what exactly should we measure?" That's the question that always should have dominated research workflow.
About This Series:
This series explores practical applications of agent-based coding in applied economics research: reducing copy-paste iteration cycles, validating large datasets with minimal training data, building robust API collection pipelines, reorganizing research codebases, integrating AI into writing workflows, and bridging methodology documentation with implementation.
The tools compress implementation time substantially. They do not replace research judgment, interpretation, or design. Used well, they shift researcher time from debugging coordinate systems toward thinking about what to measure and what it means.
AI Disclosure: This article was written with AI assistance using Claude Code. Approximately 35% of text was AI-generated (primarily structure and routine explanations). All examples, research insights, and time comparisons come from actual project experience. Final editing and voice remain human.
Code Availability: Full implementation code for the food security analysis, including transit time calculation, grocery store classification, and vulnerability index construction, is available by request. Contact us at [email protected].
Next in series: 7 Copy-Paste Cycles to 1 Command shows what changes when AI can read your entire codebase.
How to Cite This Research
Too Early To Say. "From Methods Paragraph to Working Pipeline: AI-Assisted Implementation." October 2025. https://www.tooearlytosay.com/research/methodology/methodology-to-code-ai/