Contextual Bandits with Smooth Regret: Efficient Learning in Continuous Action Spaces

Authors: Yinglun Zhu, Paul Mineiro

ICML 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct large-scale empirical evaluations demonstrating the efficacy of our proposed algorithms. |
| Researcher Affiliation | Collaboration | Yinglun Zhu (University of Wisconsin-Madison); Paul Mineiro (Microsoft Research NYC). |
| Pseudocode | Yes | Algorithm 1: Smooth IGW; Algorithm 2: Rejection Sampling for IGW; Algorithm 3: Stable Base Algorithm (Index b). |
| Open Source Code | Yes | Code to reproduce these experiments is available at https://github.com/pmineiro/smoothcb. |
| Open Datasets | Yes | The paper replicates the real-world dataset experiment from Zhu & Nowak (2020) and the online setting from Majzoubi et al. (2020), in which five large-scale OpenML regression datasets are converted into continuous-action problems. |
| Dataset Splits | No | The main text does not explicitly specify the training, validation, and test splits (e.g., percentages or counts) needed for reproduction. |
| Hardware Specification | No | The paper does not describe the hardware used for its experiments (e.g., specific GPU or CPU models). |
| Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9). |
| Experiment Setup | No | The paper mentions that hyperparameters were chosen and tuned (e.g., "initial hyperparameter choices", "tune hyperparameters") but does not give their concrete values or other system-level training settings in the main text. |
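For readers unfamiliar with the IGW scheme named in the Pseudocode row, the following is a minimal sketch of inverse-gap-weighted exploration over a discretized action grid. It is an illustrative assumption, not the paper's Algorithm 1: the function name `igw_sample`, the uniform discretization, and the particular probability formula `1 / (k + gamma * gap)` are standard textbook choices, and the paper's smoothed/rejection-sampling variants differ in detail.

```python
import random

def igw_sample(scores, gamma):
    """Inverse-gap-weighted (IGW) exploration over a discretized action grid.

    scores: predicted rewards f(x, a) for each candidate action a.
    gamma:  exploration parameter; larger gamma concentrates more
            probability on the greedy action.

    Returns (action_index, probability). Hypothetical sketch only;
    the paper's Smooth IGW and rejection-sampling versions differ.
    """
    k = len(scores)
    best = max(range(k), key=lambda i: scores[i])
    # Each non-greedy action gets probability 1 / (k + gamma * gap),
    # where gap = f(x, a_best) - f(x, a).
    probs = [1.0 / (k + gamma * (scores[best] - s)) for s in scores]
    probs[best] = 0.0
    probs[best] = 1.0 - sum(probs)  # leftover mass goes to the greedy action
    # Sample an action index from the resulting distribution.
    r, acc = random.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i, p
    return best, probs[best]
```

Because each non-greedy probability is at most 1/k, the leftover mass assigned to the greedy action is always positive, so the distribution is well defined for any gamma >= 0.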