Multiplier Bootstrap-based Exploration
Authors: Runzhe Wan, Haoyu Wei, Branislav Kveton, Rui Song
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | With extensive simulation and real-data experiments, we show the generality and adaptivity of MBE. |
| Researcher Affiliation | Collaboration | *Equal contribution. ¹Amazon; ²Department of Economics, University of California San Diego; ³Department of Statistics, North Carolina State University. Correspondence to: Runzhe Wan <runzhe.wan@gmail.com>, Rui Song <songray@gmail.com>. |
| Pseudocode | Yes | Algorithm 1: General Template for MBE |
| Open Source Code | No | The paper does not explicitly state that source code for the methodology is being released or provide a link to a code repository. |
| Open Datasets | Yes | We use the Yelp rating dataset (Zong et al., 2016) to recommend and rank K restaurants, use the Adult dataset (Dua & Graff, 2017) to send advertisements to K/2 men and K/2 women (a combinatorial semi-bandit problem with continuous rewards), and use the MovieLens dataset (Harper & Konstan, 2015) to display K movies. |
| Dataset Splits | No | The paper mentions splitting datasets into training and testing sets but does not provide specific split percentages, absolute sample counts, or refer to predefined splits with citations for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details such as CPU/GPU models, memory, or cloud instance types used for running experiments. |
| Software Dependencies | No | The paper does not explicitly list software dependencies with version numbers. |
| Experiment Setup | Yes | In all experiments below, the weights of MBE are sampled from $\mathcal{N}(1, \sigma_\omega^2)$. We fix $\lambda = 0.5$ and run MBE with three different values of $\sigma_\omega^2$: 0.5, 1 and 1.5. We also compare with the naive adaption of multiplier bootstrap (i.e., no pseudo-rewards; denoted as Naive MB). We run Algorithm 2 with $B = 50$ replicates. |
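
The experiment setup above can be illustrated with a minimal sketch. It is not the paper's Algorithm 1 or Algorithm 2, neither of which is reproduced on this page. It assumes rewards bounded in [0, 1], and it assumes that each observation contributes two pseudo-rewards (0 and 1), each scaled by λ; that form is an assumption made for illustration. The function names `bootstrapped_mean` and `replicate_means` are hypothetical. Setting `lam=0` drops the pseudo-rewards, which corresponds to the "Naive MB" baseline only in the sense described in the table.

```python
import random


def bootstrapped_mean(rewards, sigma2_omega=1.0, lam=0.5, rng=None):
    """Weighted mean of one arm's rewards under random multiplier weights.

    Assumption (not taken from the source): every observed reward is
    accompanied by two pseudo-rewards, 0 and 1, each scaled by `lam`.
    Gaussian weights can be negative, so the denominator is not guaranteed
    to be positive; a real implementation would need to handle that case.
    """
    rng = rng or random.Random()
    sd = sigma2_omega ** 0.5
    num = 0.0
    den = 0.0
    for r in rewards:
        # Independent multiplier weights drawn from N(1, sigma2_omega).
        w_obs = rng.gauss(1.0, sd)
        w_lo = rng.gauss(1.0, sd)  # weight for the pseudo-reward 0
        w_hi = rng.gauss(1.0, sd)  # weight for the pseudo-reward 1
        num += w_obs * r + lam * (w_lo * 0.0 + w_hi * 1.0)
        den += w_obs + lam * (w_lo + w_hi)
    return num / den


def replicate_means(rewards, n_replicates=50, **kwargs):
    """Draw `n_replicates` independent bootstrapped means (B = 50 in the table)."""
    return [bootstrapped_mean(rewards, **kwargs) for _ in range(n_replicates)]


if __name__ == "__main__":
    history = [0.2, 0.4, 0.6]  # hypothetical reward history for one arm
    for s2 in (0.5, 1.0, 1.5):  # the three variance settings from the table
        draws = replicate_means(history, sigma2_omega=s2, lam=0.5,
                                rng=random.Random(0))
        print(s2, min(draws), max(draws))
```

With `sigma2_omega=0` every weight equals 1, so the estimate reduces to a deterministic shrinkage of the sample mean toward 0.5. Larger values of `sigma2_omega` spread the replicates further apart, which is the knob the three settings in the table vary.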