Tutorial

7 articles

How to Build a Census Data Pipeline That Doesn't Silently Fail

A Python workflow for pulling ACS data from the Census API, including the validation checks that prevent bad data from reaching the analysis.

Feb 2026 · Methodology

How to Estimate Difference-in-Differences in Python

A statsmodels workflow for event study estimation, with the diagnostics that separate credible estimates from noise.

Feb 2026 · Methodology

How to Validate GTFS Feeds Before They Break the Routing Engine

A Python workflow for catching the transit data problems that structural checks miss, with six validation layers ranging from download fallbacks to multi-agency smoke tests.

Feb 2026 · Methodology

How to Build a Classifier When 94% Accuracy Means Nothing

A scikit-learn workflow for imbalanced classification, with the evaluation metrics that actually matter.

Feb 2026 · Methodology

How to Interpret a Classifier with SHAP Values

A Python workflow for understanding what drives model predictions, and what SHAP importance actually measures.

Feb 2026 · Methodology

Spatial Analysis with GeoPandas: From Joins to Autocorrelation

A Python spatial analysis workflow that starts with point-to-polygon joins and builds toward spatial weights, autocorrelation testing, and LISA cluster detection.

Feb 2026 · Methodology

r5py TravelTimeMatrixComputer: Compute 2.7M Transit Routes

A worked example using r5py and TravelTimeMatrixComputer to compute 2.7M transit travel times from GTFS and OpenStreetMap data, with no API keys needed.

Nov 2025 · Methodology