Contextual Bandits with Smooth Regret: Efficient Learning in Continuous Action Spaces
Authors: Yinglun Zhu, Paul Mineiro
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct large-scale empirical evaluations demonstrating the efficacy of our proposed algorithms. |
| Researcher Affiliation | Collaboration | Yinglun Zhu 1 Paul Mineiro 2 1University of Wisconsin-Madison 2Microsoft Research NYC. |
| Pseudocode | Yes | Algorithm 1 Smooth IGW. Algorithm 2 Rejection Sampling for IGW. Algorithm 3 Stable Base Algorithm (Index b). |
| Open Source Code | Yes | Code to reproduce these experiments is available at https://github.com/pmineiro/smoothcb. |
| Open Datasets | Yes | We replicate the real-world dataset experiment from Zhu & Nowak (2020). We replicate the online setting from Majzoubi et al. (2020), where 5 large-scale OpenML regression datasets are converted into continuous-action problems. |
| Dataset Splits | No | The paper mentions using datasets but does not explicitly specify the training, validation, and test splits (e.g., percentages or counts) required for reproduction in the main text. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for its experiments (e.g., specific GPU or CPU models). |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9). |
| Experiment Setup | No | The paper mentions that hyperparameter choices were made (e.g., "initial hyperparameter choices", "tune hyperparameters") but does not report the concrete hyperparameter values or other system-level training settings in the main text. |
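For context on the pseudocode row: the paper's Algorithm 1 (Smooth IGW) builds on the inverse-gap-weighting (IGW) exploration scheme. The sketch below shows only the classic discrete-action IGW sampling rule, not the paper's continuous-action smoothing; the function name, the `gamma` parameter, and the `K + gamma * gap` normalization are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def igw_sample(rewards, gamma, rng=None):
    """Illustrative discrete-action inverse-gap weighting (IGW).

    NOTE: a minimal sketch of the generic IGW rule that Smooth IGW
    generalizes to continuous actions; names and constants here are
    assumptions, not the paper's algorithm.

    rewards: predicted rewards for each of K discrete arms.
    gamma:   exploration parameter (larger => greedier).
    """
    rng = rng or np.random.default_rng()
    rewards = np.asarray(rewards, dtype=float)
    K = len(rewards)
    best = int(np.argmax(rewards))
    # Non-greedy arms: probability inversely proportional to their
    # reward gap from the greedy arm.
    probs = 1.0 / (K + gamma * (rewards[best] - rewards))
    probs[best] = 0.0
    probs[best] = 1.0 - probs.sum()  # remaining mass on the greedy arm
    return int(rng.choice(K, p=probs)), probs
```

Larger `gamma` concentrates mass on the empirically best arm while every arm keeps nonzero probability, which is what makes importance-weighted regression updates well defined.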