Machine Learning for Variance Reduction in Online Experiments
Authors: Yongyi Guo, Dominic Coey, Mikael Konutgan, Wenting Li, Chris Schoener, Matt Goldman
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 3, "Simulations & empirical results": We now validate MLRATE in practice, on both simulated data, and real Facebook user data. |
| Researcher Affiliation | Collaboration | Yongyi Guo, Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544, yongyig@princeton.edu; Dominic Coey, Facebook, 1 Hacker Way, Menlo Park, CA 94025, coey@fb.com; Mikael Konutgan, Facebook, 1 Hacker Way, Menlo Park, CA 94025, kmikael@fb.com; Wenting Li, Facebook, 1 Hacker Way, Menlo Park, CA 94025, wentingli@fb.com; Chris Schoener, Facebook, 1 Hacker Way, Menlo Park, CA 94025, chrissc@fb.com; Matt Goldman, Facebook, 1 Hacker Way, Menlo Park, CA 94025, mattgoldman@fb.com |
| Pseudocode | Yes | Algorithm 1: Estimation and inference with MLRATE |
| Open Source Code | No | The paper does not provide any information about the availability of open-source code for the described methodology. |
| Open Datasets | No | We evaluate the estimator on 48 real metrics used in online experiments run by Facebook, capturing a broad range of the most commonly consulted user engagement and app performance measurements. |
| Dataset Splits | No | The paper specifies a cross-fitting strategy with K splits (e.g., K=2), but it does not describe conventional train/validation/test splits for the overall experimental evaluation or for the ML models themselves beyond the cross-fitting mechanism. |
| Hardware Specification | No | All computation is done on an internal cluster, on a standard 64GB RAM machine. |
| Software Dependencies | No | Both in these simulations and the subsequent analysis of Facebook data, we choose gradient boosted regression trees (GBDT) and elastic net regression as two examples of ML prediction procedures in MLRATE, with scikit-learn’s implementation [28]. |
| Experiment Setup | Yes | Moreover, we choose K = 2 splits for cross-fitting. |
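
The table cites K = 2 cross-fitting and scikit-learn's gradient boosted regression trees but gives no code, so the following is a minimal sketch of how those pieces might fit together, not the authors' implementation. It assumes the adjustment step is an ordinary least-squares regression of the outcome on the treatment indicator, the cross-fitted prediction, and their centered interaction; the function names (`cross_fit_predictions`, `adjusted_effect`, `simulate`), the synthetic data, and the default hyperparameters are all hypothetical and chosen only for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold


def cross_fit_predictions(X, y, n_splits=2, seed=0):
    """Out-of-fold predictions: each row is scored by a model that never saw it."""
    g = np.empty(len(y), dtype=float)
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for train_idx, test_idx in folds.split(X):
        model = GradientBoostingRegressor(random_state=seed)
        model.fit(X[train_idx], y[train_idx])
        g[test_idx] = model.predict(X[test_idx])
    return g


def adjusted_effect(y, t, g):
    """OLS of y on [1, t, g, t * (g - mean(g))]; returns the coefficient on t.

    Assumption: this interacted regression stands in for the adjustment step.
    """
    design = np.column_stack([np.ones_like(g), t, g, t * (g - g.mean())])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


def simulate(n=2000, effect=1.0, seed=0):
    """Synthetic experiment (hypothetical): nonlinear covariates, random assignment."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 5))
    t = rng.integers(0, 2, size=n).astype(float)
    noise = rng.normal(scale=0.5, size=n)
    y = np.sin(X[:, 0]) + X[:, 1] ** 2 + effect * t + noise
    return X, t, y


X, t, y = simulate()
g = cross_fit_predictions(X, y, n_splits=2)  # K = 2, as in the table
diff_in_means = y[t == 1].mean() - y[t == 0].mean()
adjusted = adjusted_effect(y, t, g)
print(f"difference in means: {diff_in_means:.3f}")
print(f"regression-adjusted: {adjusted:.3f}")
```

Swapping `GradientBoostingRegressor` for scikit-learn's `ElasticNet` would give the other predictor named in the table. The sketch omits the standard-error calculation, which the table does not describe.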