Medicaid

4 articles

What 227 Million Rows of Medicaid Data Can and Can't Tell Us

HHS published the largest Medicaid dataset in history. Here's what it contains, what's missing, and why the gap matters for fraud screening.

Feb 2026 · Methodology

The Label That Isn't: Why "Excluded" Doesn't Mean "Fraudulent"

40% of LEIE exclusions are license revocations, not fraud. Why the federal exclusion list is a contaminated label for fraud detection.

Feb 2026 · Methodology

What Billing Patterns Actually Look Like

Excluded providers bill less than non-excluded ones. Label contamination, the autism false-positive problem, and proxy-variable bias.

Feb 2026 · Methodology

Can a Classifier Find What Investigators Miss?

Logistic regression matches complex ML on honest validation. Model complexity adds noise, not signal.

Feb 2026 · Methodology