Machine Learning Estimation of Heterogeneous Treatment Effects with Instruments
Authors: Vasilis Syrgkanis, Victor Lei, Miruna Oprescu, Maggie Hei, Keith Battocchi, Greg Lewis
NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We applied our method to estimate the effect of membership on downstream webpage engagement on Trip Advisor, using as an instrument an intent-to-treat A/B test among 4 million Trip Advisor users, where some users received an easier membership sign-up process. We also validate our method on synthetic data and on public datasets for the effects of schooling on income. |
| Researcher Affiliation | Industry | Vasilis Syrgkanis (Microsoft Research, vasy@microsoft.com); Victor Lei (Trip Advisor, vlei@tripadvisor.com); Miruna Oprescu (Microsoft Research, moprescu@microsoft.com); Maggie Hei (Microsoft Research, Maggie.Hei@microsoft.com); Keith Battocchi (Microsoft Research, kebatt@microsoft.com); Greg Lewis (Microsoft Research, glewis@microsoft.com) |
| Pseudocode | Yes | Algorithm 1 (Heterogeneous Effects: DMLIV): partially orthogonal, convex loss. Algorithm 2 (DRIV): orthogonal convex loss for CATE and projections of CATE. |
| Open Source Code | Yes | Prototype code for all the algorithms presented and the synthetic data experimental study can be found at https://github.com/Microsoft/EconML/tree/master/prototypes/dml_iv. |
| Open Datasets | Yes | We also validate our method on synthetic data and on public datasets for the effects of schooling on income... We analyze Card's data from the National Longitudinal Survey of Young Men (NLSYM, 1966) to estimate the ATE of education on wages and find sources of heterogeneity. We describe the NLSYM data in depth in Appendix D. |
| Dataset Splits | Yes | Our data consists of 4,606,041 total users in a 50:50 A/B test... 1) On a half-sample S1: regress i) Y on X, ii) T on X, Z, iii) T on X, to learn estimates q̂, ĥ and p̂ respectively; 2) Minimize the empirical analogue of the square loss over some hypothesis space Θ on the other half-sample S2, or use any learning algorithm that achieves small generalization error w.r.t. loss L1(θ; q̂, ĥ, p̂) over Θ. (A hedged code sketch of this split-and-minimize procedure appears below the table.) |
| Hardware Specification | Yes | We attempted to use the R implementation of Generalized Random Forests (GRF)[4] to compare with our results. However, we could not fit due to the size of the data and insufficient memory errors (with 64GB RAM). |
| Software Dependencies | No | The paper mentions software like 'LASSO regression', 'logistic regression', 'gradient boosting regression and classification (GB)', 'R implementation of Generalized Random Forests', and 'XGBoost [8]', but does not specify version numbers for these software components. |
| Experiment Setup | Yes | We applied two sets of nuisance estimation models with different complexity characteristics: LASSO regression and logistic regression with an L2 penalty (LM); and gradient boosting regression and classification (GB). The only exception was E[Z|X], where we used a fixed estimate of 0.5 since the instrument was a large randomized experiment. See Sec. B.1 for details. (Appendix B.1: For all nuisance models, we use default XGBoost parameters with 100 trees and early stopping on a 20% validation set with patience 20. ... For Lasso regressions, we use a ℓ1 regularization parameter α = 0.001.) |
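
The split-and-minimize procedure quoted in the Dataset Splits row (the DMLIV estimator of Algorithm 1) can be summarized in a short sketch. This is a minimal illustration, assuming scikit-learn gradient boosting for the nuisance regressions and a weighted-regression rewrite of the square loss; the function name `dmliv_fit` and the model choices are ours, not the authors' prototype (which lives in the EconML repository linked above).

```python
# Minimal sketch of the two-stage DMLIV procedure quoted above.
# Assumptions: X and Z are 2-D numpy arrays, T and Y are 1-D; the
# nuisance learners and the weighted-regression rewrite of the square
# loss are illustrative choices, not the authors' exact prototype.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

def dmliv_fit(X, Z, T, Y, cate_model):
    """Fit a CATE model theta(X) via the DMLIV square loss on a half-sample."""
    # Step 1: split into half-samples S1 (nuisances) and S2 (CATE).
    X1, X2, Z1, Z2, T1, T2, Y1, Y2 = train_test_split(
        X, Z, T, Y, test_size=0.5, random_state=0)

    # On S1: regress i) Y on X, ii) T on (X, Z), iii) T on X.
    q_hat = GradientBoostingRegressor().fit(X1, Y1)                   # E[Y | X]
    h_hat = GradientBoostingRegressor().fit(np.hstack([X1, Z1]), T1)  # E[T | X, Z]
    p_hat = GradientBoostingRegressor().fit(X1, T1)                   # E[T | X]

    # Step 2, on S2: minimize the empirical analogue of the square loss
    #   (Y - q(X) - theta(X) * (h(X, Z) - p(X)))^2.
    # Rewriting it as a weighted regression of y_res / t_res on X with
    # weights t_res^2 gives the same minimizer (ignoring near-zero t_res,
    # which a real implementation would need to handle).
    y_res = Y2 - q_hat.predict(X2)
    t_res = h_hat.predict(np.hstack([X2, Z2])) - p_hat.predict(X2)
    cate_model.fit(X2, y_res / t_res, sample_weight=t_res ** 2)
    return cate_model
```

Any scikit-learn-style regressor that accepts `sample_weight` can play the role of `cate_model` here.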
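
The hyperparameters quoted from Appendix B.1 in the Experiment Setup row translate roughly into the configuration below. Argument names follow the current xgboost and scikit-learn APIs and may differ across versions (the paper does not pin any); treat this as an assumption-laden sketch rather than the authors' code.

```python
# Rough sketch of the nuisance-model settings quoted from Appendix B.1:
# default XGBoost with 100 trees and early stopping (patience 20) on a
# 20% validation split, plus Lasso with alpha = 0.001 and L2-penalized
# logistic regression for the linear ("LM") specification. Argument
# placement for early stopping varies across xgboost versions.
from sklearn.linear_model import Lasso, LogisticRegression
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

def fit_gb_nuisance(X, y):
    """Gradient-boosted nuisance regression with early stopping (the "GB" spec)."""
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)
    model = XGBRegressor(n_estimators=100, early_stopping_rounds=20)
    model.fit(X_tr, y_tr, eval_set=[(X_val, y_val)], verbose=False)
    return model

# Linear ("LM") nuisance models:
lasso_nuisance = Lasso(alpha=0.001)                   # continuous nuisances
logistic_nuisance = LogisticRegression(penalty="l2")  # binary nuisances

# E[Z|X] was not learned: it was fixed at 0.5, since the instrument is a
# 50:50 randomized experiment (per the Experiment Setup row).
ez_x = 0.5
```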