Short-lived High-volume Bandits
Authors: Su Jia, Nishant Oli, Ian Anderson, Paul Duff, Andrew A Li, R. Ravi
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We further validate the effectiveness of our policy through a large-scale field experiment on Glance, a content card-serving platform. |
| Researcher Affiliation | Collaboration | (1) Center of Data Science for Enterprise and Society (CDSES), Cornell University, Ithaca, USA; (2) Glance, Bangalore, India; (3) Tepper School of Business, Carnegie Mellon University, Pittsburgh, USA. |
| Pseudocode | Yes | Algorithm 1 Batched Successive Elimination Policy BSE(ε0, ..., εℓ−1; k) for Batched Bandits. |
| Open Source Code | No | The paper describes implementation of its policy in a field experiment, but does not provide any specific links or explicit statements about the release of its source code. |
| Open Datasets | No | The paper mentions analyzing 'user interaction data' from 'Glance, a leading lock-screen content platform' and using 'past data' for approximation, but it does not provide specific access information (link, DOI, or formal citation) for any public dataset. |
| Dataset Splits | No | The paper describes a field experiment and mentions a DNN recommender, but it does not provide specific details on training, validation, and test dataset splits for reproducibility. |
| Hardware Specification | No | The paper mentions implementing its policy 'on their real system' and refers to a 'content card-serving platform', but it does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used for the experiments. |
| Software Dependencies | No | The paper mentions using a 'Deep Neural Network (DNN)' and describes algorithms like 'Thompson Sampling' and 'Beta-Bernoulli reward model', but it does not list any specific software dependencies with version numbers. |
| Experiment Setup | Yes | Using an offline semi-synthetic simulation, we determined the empirically optimal parameter to be around ε0 = 0.2, which we used in the field experiment. |
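
The Software Dependencies row mentions Thompson Sampling with a Beta-Bernoulli reward model. Since no code was released, the following is only a minimal sketch of the textbook form of that technique, not the paper's implementation or its BSE policy. The function name, the simulated success rates, the uniform Beta(1, 1) prior, and the horizon are all illustrative assumptions.

```python
import random


def thompson_sampling(true_rates, horizon, seed=0):
    """Textbook Beta-Bernoulli Thompson Sampling on simulated arms.

    true_rates: hypothetical per-arm success probabilities (illustrative,
    not taken from the paper). Returns (pull counts per arm, total reward).
    """
    rng = random.Random(seed)
    k = len(true_rates)
    successes = [0] * k  # number of reward-1 observations per arm
    failures = [0] * k   # number of reward-0 observations per arm
    total_reward = 0

    for _ in range(horizon):
        # Draw one sample per arm from its Beta(1 + s, 1 + f) posterior
        # (assumed uniform Beta(1, 1) prior).
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(k)
        ]
        # Play the arm whose sampled mean is largest.
        arm = max(range(k), key=lambda a: samples[a])
        # Simulate a Bernoulli reward for the chosen arm.
        reward = 1 if rng.random() < true_rates[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        total_reward += reward

    pulls = [successes[a] + failures[a] for a in range(k)]
    return pulls, total_reward


# Hypothetical example: three arms with made-up success rates.
pulls, total_reward = thompson_sampling([0.05, 0.10, 0.30], horizon=2000)
```

With success rates this well separated, most pulls should concentrate on the last arm as its posterior sharpens. A batched policy such as the paper's would instead update only between batches, which this sketch does not attempt to model.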