Machine Learning Estimation of Heterogeneous Treatment Effects with Instruments
Authors: Vasilis Syrgkanis, Victor Lei, Miruna Oprescu, Maggie Hei, Keith Battocchi, Greg Lewis
NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We applied our method to estimate the effect of membership on downstream webpage engagement on Trip Advisor, using as an instrument an intent-to-treat A/B test among 4 million Trip Advisor users, where some users received an easier membership sign-up process. We also validate our method on synthetic data and on public datasets for the effects of schooling on income. |
| Researcher Affiliation | Industry | Vasilis Syrgkanis (Microsoft Research, vasy@microsoft.com); Victor Lei (Trip Advisor, vlei@tripadvisor.com); Miruna Oprescu (Microsoft Research, moprescu@microsoft.com); Maggie Hei (Microsoft Research, Maggie.Hei@microsoft.com); Keith Battocchi (Microsoft Research, kebatt@microsoft.com); Greg Lewis (Microsoft Research, glewis@microsoft.com) |
| Pseudocode | Yes | Algorithm 1 (Heterogeneous Effects: DMLIV): partially orthogonal, convex loss. Algorithm 2 (DRIV): orthogonal convex loss for CATE and projections of CATE. |
| Open Source Code | Yes | Prototype code for all the algorithms presented and the synthetic data experimental study can be found at https://github.com/Microsoft/EconML/tree/master/prototypes/dml_iv. |
| Open Datasets | Yes | We also validate our method on synthetic data and on public datasets for the effects of schooling on income... We analyze Card's data from the National Longitudinal Survey of Young Men (NLSYM, 1966) to estimate the ATE of education on wages and find sources of heterogeneity. We describe the NLSYM data in depth in Appendix D. |
| Dataset Splits | Yes | Our data consists of 4,606,041 total users in a 50:50 A/B test... 1) On a half-sample S1: regress i) Y on X, ii) T on X, Z, iii) T on X, to learn estimates q̂, ĥ and p̂ respectively; 2) Minimize the empirical analogue of the square loss over some hypothesis space Θ on the other half-sample S2, or use any learning algorithm that achieves small generalization error w.r.t. loss L1(θ; q̂, ĥ, p̂) over Θ. (A hedged code sketch of this split-and-minimize procedure appears below the table.) |
| Hardware Specification | Yes | We attempted to use the R implementation of Generalized Random Forests (GRF)[4] to compare with our results. However, we could not fit due to the size of the data and insufficient memory errors (with 64GB RAM). |
| Software Dependencies | No | The paper mentions software like 'LASSO regression', 'logistic regression', 'gradient boosting regression and classification (GB)', 'R implementation of Generalized Random Forests', and 'XGBoost [8]', but does not specify version numbers for these software components. |
| Experiment Setup | Yes | We applied two sets of nuisance estimation models with different complexity characteristics: LASSO regression and logistic regression with an L2 penalty (LM); and gradient boosting regression and classification (GB). The only exception was E[Z|X], where we used a fixed estimate of 0.5 since the instrument was a large randomized experiment. See Sec. B.1 for details. (Appendix B.1: For all nuisance models, we use default XGBoost parameters with 100 trees and early stopping on a 20% validation set with patience 20. ... For Lasso regressions, we use a ℓ1 regularization parameter α = 0.001.) |
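
The split-and-minimize procedure quoted in the Dataset Splits row (the DMLIV estimator of Algorithm 1) can be summarized in a short sketch. This is a minimal illustration, assuming scikit-learn gradient boosting for the nuisance regressions and a weighted-regression rewrite of the square loss; the function name `dmliv_fit` and the model choices are ours, not the authors' prototype (which lives in the EconML repository linked above).

```python
# Minimal sketch of the two-stage DMLIV procedure quoted above.
# Assumptions: X and Z are 2-D numpy arrays, T and Y are 1-D; the
# nuisance learners and the weighted-regression rewrite of the square
# loss are illustrative choices, not the authors' exact prototype.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

def dmliv_fit(X, Z, T, Y, cate_model):
    """Fit a CATE model theta(X) via the DMLIV square loss on a half-sample."""
    # Step 1: split into half-samples S1 (nuisances) and S2 (CATE).
    X1, X2, Z1, Z2, T1, T2, Y1, Y2 = train_test_split(
        X, Z, T, Y, test_size=0.5, random_state=0)

    # On S1: regress i) Y on X, ii) T on (X, Z), iii) T on X.
    q_hat = GradientBoostingRegressor().fit(X1, Y1)                   # E[Y | X]
    h_hat = GradientBoostingRegressor().fit(np.hstack([X1, Z1]), T1)  # E[T | X, Z]
    p_hat = GradientBoostingRegressor().fit(X1, T1)                   # E[T | X]

    # Step 2, on S2: minimize the empirical analogue of the square loss
    #   (Y - q(X) - theta(X) * (h(X, Z) - p(X)))^2.
    # Rewriting it as a weighted regression of y_res / t_res on X with
    # weights t_res^2 gives the same minimizer (ignoring near-zero t_res,
    # which a real implementation would need to handle).
    y_res = Y2 - q_hat.predict(X2)
    t_res = h_hat.predict(np.hstack([X2, Z2])) - p_hat.predict(X2)
    cate_model.fit(X2, y_res / t_res, sample_weight=t_res ** 2)
    return cate_model
```

Any scikit-learn-style regressor that accepts `sample_weight` can play the role of `cate_model` here.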
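
The hyperparameters quoted from Appendix B.1 in the Experiment Setup row translate roughly into the configuration below. Argument names follow the current xgboost and scikit-learn APIs and may differ across versions (the paper does not pin any); treat this as an assumption-laden sketch rather than the authors' code.

```python
# Rough sketch of the nuisance-model settings quoted from Appendix B.1:
# default XGBoost with 100 trees and early stopping (patience 20) on a
# 20% validation split, plus Lasso with alpha = 0.001 and L2-penalized
# logistic regression for the linear ("LM") specification. Argument
# placement for early stopping varies across xgboost versions.
from sklearn.linear_model import Lasso, LogisticRegression
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

def fit_gb_nuisance(X, y):
    """Gradient-boosted nuisance regression with early stopping (the "GB" spec)."""
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)
    model = XGBRegressor(n_estimators=100, early_stopping_rounds=20)
    model.fit(X_tr, y_tr, eval_set=[(X_val, y_val)], verbose=False)
    return model

# Linear ("LM") nuisance models:
lasso_nuisance = Lasso(alpha=0.001)                   # continuous nuisances
logistic_nuisance = LogisticRegression(penalty="l2")  # binary nuisances

# E[Z|X] was not learned: it was fixed at 0.5, since the instrument is a
# 50:50 randomized experiment (per the Experiment Setup row).
ez_x = 0.5
```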