Rerandomization and Regression Adjustment
Seminar | August 22 | 4-5 p.m. | 1011 Evans Hall
Peng Ding, UC Berkeley
Randomization is a basis for the statistical inference of treatment effects without assumptions on the outcome-generating process. Appropriately using covariates further yields more precise estimators in randomized experiments. In his seminal work Design of Experiments, R. A. Fisher suggested blocking on discrete covariates in the design stage and conducting the analysis of covariance (ANCOVA) in the analysis stage. In fact, we can embed blocking into a wider class of experimental designs called rerandomization, and extend the classical ANCOVA to more general regression-adjusted estimators. Rerandomization trumps complete randomization in the design stage, and regression adjustment trumps the simple difference-in-means estimator in the analysis stage. We argue that practitioners should always consider using a combination of rerandomization and regression adjustment. Under the randomization-inference framework, we establish a unified theory allowing the designer and analyzer to have access to different sets of covariates. We find that asymptotically (a) for any given estimator with or without regression adjustment, using rerandomization will never hurt either the sampling precision or the estimated precision, and (b) for any given design with or without rerandomization, using our regression-adjusted estimator will never hurt the estimated precision. To theoretically quantify these statements, we propose two notions of optimal regression-adjusted estimators and measure the additional gains of the designer and analyzer based on the sampling precision and estimated precision. This is joint work with Xinran Li.
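The design-then-analyze recipe the abstract describes can be sketched in a small simulation. The sketch below is illustrative only: the Mahalanobis balance criterion, the acceptance threshold, and the simulated data are common assumptions for rerandomization, not the talk's exact specification; the analysis stage uses a fully interacted (Lin-style) regression-adjusted estimator.

```python
import numpy as np

def mahalanobis_imbalance(x, z):
    """Mahalanobis distance between treated and control covariate means."""
    n1, n0 = z.sum(), (1 - z).sum()
    diff = x[z == 1].mean(axis=0) - x[z == 0].mean(axis=0)
    # Covariance of the difference in covariate means under complete randomization
    cov = np.atleast_2d(np.cov(x, rowvar=False)) * (1 / n1 + 1 / n0)
    return float(diff @ np.linalg.solve(cov, diff))

def rerandomize(x, n_treat, threshold, rng, max_draws=10_000):
    """Design stage: redraw complete randomizations until balance is acceptable."""
    n = x.shape[0]
    for _ in range(max_draws):
        z = np.zeros(n, dtype=int)
        z[rng.choice(n, size=n_treat, replace=False)] = 1
        if mahalanobis_imbalance(x, z) < threshold:
            return z
    raise RuntimeError("no acceptable draw; consider raising the threshold")

def adjusted_estimate(y, z, x):
    """Analysis stage: OLS of y on treatment, centered covariates,
    and their interactions; the treatment coefficient estimates the ATE."""
    xc = x - x.mean(axis=0)
    design = np.column_stack([np.ones_like(y), z, xc, z[:, None] * xc])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]

# Simulated experiment with a constant treatment effect tau = 1.0
rng = np.random.default_rng(0)
n, tau = 200, 1.0
x = rng.normal(size=(n, 2))
y0 = x @ np.array([1.0, -0.5]) + rng.normal(size=n)
z = rerandomize(x, n_treat=100, threshold=2.0, rng=rng)
y = y0 + tau * z
print(f"regression-adjusted estimate: {adjusted_estimate(y, z, x):.2f}")
```

Combining the two stages, as the talk advocates, means both the accepted assignment and the adjusted estimator use the covariates: rerandomization improves balance before the outcomes are collected, and regression adjustment removes remaining covariate-explained variation afterward.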