A/B Testing for Recommender Systems in a Two-sided Marketplace
Authors: Preetam Nandy, Divya Venugopalan, Chun Lo, Shaunak Chatterjee
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We use simulations to validate our approach and compare it against existing methods. We also deployed UniCoRn in an edge recommendation application that serves tens of millions of members and billions of edge recommendations daily. |
| Researcher Affiliation | Industry | Preetam Nandy, Divya Venugopalan, Chun Lo, Shaunak Chatterjee; LinkedIn Corporation, Mountain View, CA 94083; {pnandy, dvenugopalan, chunlo, shchatterjee}@linkedin.com |
| Pseudocode | Yes | Algorithm 1 UniCoRn(P0, P1, α) |
| Open Source Code | Yes | Code is available in the supplementary material. |
| Open Datasets | No | The paper uses a simulated environment to generate data for evaluation, rather than a publicly available dataset with concrete access information. "For Sections 4.1 and 4.2, we create a simulated environment with L = 100 positions to generate data for the empirical evaluation of UniCoRn(P0, P1, α)" |
| Dataset Splits | No | The paper describes generating data within a simulated environment with certain parameters (e.g., L=100 positions, NS=50000 sessions) but does not provide explicit train/test/validation splits of a fixed dataset, which is typical for empirical studies on pre-existing datasets. The simulated nature means these splits are not directly applicable. |
| Hardware Specification | Yes | Each iteration (based on 1000 sessions) including the data generation, reranking based on UniCoRn(α) for α ∈ {0, 0.2, 1}, HaThucEtAl and OASIS, and the treatment effect estimation took 36 seconds on average on a Macbook Pro with 2.4 GHz 8-Core Intel Core i9 processor and 32 GB 2667 MHz DDR4 memory. |
| Software Dependencies | No | The paper states, "We implemented the Algorithms in R." and "The changes were implemented in Java in our distributed, real-time production serving system." However, no specific version numbers for R, Java, or any libraries/packages are provided. |
| Experiment Setup | Yes | For Sections 4.1 and 4.2, we create a simulated environment with L = 100 positions to generate data for the empirical evaluation of UniCoRn(P0, P1, α). First, we compare the design accuracy and the cost of the variants of UniCoRn(α) based on a number of values of α. Next, we compare the performances of UniCoRn(α) for α ∈ {0, 0.2, 1}, the counterfactual ranking method of [3] (we will refer to this as HaThucEtAl) and a modified version of OASIS [9] for estimating the average treatment effect. We consider two different settings, namely (i) 10% treatment and 90% control and (ii) 50% treatment and 50% control. |