Proportional Response: Contextual Bandits for Simple and Cumulative Regret Minimization
Authors: Sanath Kumar Krishnamurthy, Ruohan Zhan, Susan Athey, Emma Brunskill
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To demonstrate the computational tractability of our approach, we ran a simulation on setting within a $\mathbb{R}^2$ context space, eight arms, linear models, and an exploration horizon of 5000. Our algorithms ran in less than 9 seconds on a Macbook M1 Pro. We also compare with other baselines on simple/cumulative regret. See Appendix E.4 for details. |
| Researcher Affiliation | Academia | Sanath Kumar Krishnamurthy, Management Science and Engineering, Stanford University, sanathsk@stanford.edu; Ruohan Zhan, Industrial Engineering and Decision Analytics, Hong Kong University of Science and Technology, rhzhan@ust.hk; Susan Athey, Graduate School of Business, Stanford University, athey@stanford.edu; Emma Brunskill, Computer Science Department, Stanford University, ebrun@cs.stanford.edu |
| Pseudocode | Yes | Algorithm 1: ω Risk Adjusted Proportional Response (ω-RAPR) |
| Open Source Code | No | The paper does not include any explicit statements or links indicating that source code for the described methodology is publicly available. |
| Open Datasets | No | The paper mentions running a simulation ('We ran a simulation on setting within a $\mathbb{R}^2$ context space, eight arms, linear models, and an exploration horizon of 5000.') but does not specify a publicly available dataset, nor does it provide any link, DOI, or citation for data access. |
| Dataset Splits | No | The paper states, 'Our algorithm splits $S_m$ into three equally-sized subsets: $S_{m,1}$, $S_{m,2}$ and $S_{m,3}$.' This refers to internal data splitting within the algorithm's operation, not to explicit train/validation/test dataset splits for experimental reproduction. The 'Simulations' section does not provide any specific dataset split information. |
| Hardware Specification | Yes | Our algorithms ran in less than 9 seconds on a Macbook M1 Pro. |
| Software Dependencies | No | The paper describes conceptual components of its algorithm and refers to types of solvers (e.g., 'cost-sensitive classification (CSC) solver'), but it does not specify any particular software libraries, packages, or their version numbers that would be required for replication. |
| Experiment Setup | No | The paper states, 'We ran a simulation on setting within a $\mathbb{R}^2$ context space, eight arms, linear models, and an exploration horizon of 5000.' While this gives some context, it does not provide specific hyperparameter values (e.g., learning rate, batch size, optimizer settings) or detailed training configurations needed for reproducible experimental setup. |
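
Since no code or hyperparameters are released, the following is only a minimal sketch of what a simulation with the stated dimensions (two-dimensional contexts, eight arms, linear reward models, horizon 5000) might look like. Everything beyond those three numbers is an assumption made for illustration: the Gaussian arm weights, the uniform context distribution, the noise level, and the uniform-random exploration policy, which stands in as a placeholder and is not ω-RAPR. The `split_three` helper mirrors the quoted three-way split of a batch, applied here to the whole log rather than to a per-epoch set.

```python
import random

# Values taken from the paper's description of the simulation.
CONTEXT_DIM = 2   # two-dimensional context space
NUM_ARMS = 8      # eight arms
HORIZON = 5000    # exploration horizon


def make_arm_weights(rng, num_arms=NUM_ARMS, dim=CONTEXT_DIM):
    """Draw one hypothetical linear reward vector per arm (assumed Gaussian)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_arms)]


def expected_reward(weights, context, arm):
    """Linear reward model: inner product of the arm's weights and the context."""
    return sum(w * x for w, x in zip(weights[arm], context))


def simulate_uniform_exploration(seed=0, horizon=HORIZON, noise_std=0.1):
    """Log (context, arm, reward) tuples under uniform random exploration.

    Uniform exploration is a placeholder policy, not the paper's algorithm.
    """
    rng = random.Random(seed)
    weights = make_arm_weights(rng)
    data = []
    for _ in range(horizon):
        # Assumed context distribution: uniform on [-1, 1]^2.
        context = [rng.uniform(-1.0, 1.0) for _ in range(CONTEXT_DIM)]
        arm = rng.randrange(NUM_ARMS)
        reward = expected_reward(weights, context, arm) + rng.gauss(0.0, noise_std)
        data.append((context, arm, reward))
    return weights, data


def split_three(samples):
    """Split a batch into three equally sized subsets, dropping any remainder."""
    k = len(samples) // 3
    return samples[:k], samples[k:2 * k], samples[2 * k:3 * k]


weights, data = simulate_uniform_exploration()
s1, s2, s3 = split_three(data)
```

Because 5000 is not divisible by three, this sketch drops the remainder so that the three subsets have equal size; how the paper handles the remainder is not stated in the text quoted above.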