Tutorial
7 articles
How to Build a Census Data Pipeline That Doesn't Silently Fail
A Python workflow for pulling ACS data from the Census API, including the validation checks that prevent bad data from reaching the analysis.
How to Estimate Difference-in-Differences in Python
A statsmodels workflow for event study estimation, with the diagnostics that separate credible estimates from noise.
How to Validate GTFS Feeds Before They Break the Routing Engine
A Python workflow for catching the transit data problems that structural checks miss, with six validation layers that run from download fallbacks to multi-agency smoke tests.
How to Build a Classifier When 94% Accuracy Means Nothing
A scikit-learn workflow for imbalanced classification, with the evaluation metrics that actually matter.
How to Interpret a Classifier with SHAP Values
A Python workflow for understanding what drives model predictions, and what SHAP importance actually measures.
Spatial Analysis with GeoPandas: From Joins to Autocorrelation
A Python spatial analysis workflow that starts with point-to-polygon joins and builds toward spatial weights, autocorrelation testing, and LISA cluster detection.
r5py TravelTimeMatrixComputer: Compute 2.7M Transit Routes
A worked example using r5py and TravelTimeMatrixComputer to compute 2.7M transit travel times from GTFS and OpenStreetMap data, with no API keys needed.