Towards honest inference from real-world healthcare data

Seminar | March 6 | 4-5 p.m. | 1011 Evans Hall

 David Madigan, Columbia University

 Department of Statistics

In practice, our learning healthcare system relies primarily on observational studies that generate
one effect estimate at a time, using customized study designs with unknown operating
characteristics, and publish (or not) one estimate at a time. When we investigate
the distribution of estimates that this process has produced, we see clear evidence
of its shortcomings, including an apparent over-abundance of statistically significant effects.
We propose a standardized process for performing observational research that
can be evaluated, calibrated and applied at scale to generate a more reliable and complete
evidence base than previously possible. We demonstrate this new paradigm by generating
evidence about all pairwise comparisons of 39 treatments for hypertension for a relevant
set of 58 health outcomes using nine large-scale health record databases from four countries.
In total, we estimate 1.3M hazard ratios, each using a comparative effectiveness study
design and propensity score stratification on par with current one-off observational studies
in the literature. Moreover, the process enables us to employ negative and positive controls
to evaluate and calibrate estimates, ensuring, for example, that the 95% confidence
interval includes the true effect size 95% of the time. The result set consistently reflects
current established knowledge where known, and its distribution shows no evidence
of the faults of the current process.