Prediction with Limited Advice and Multiarmed Bandits with Paid Observations

Authors: Yevgeny Seldin, Peter Bartlett, Koby Crammer, Yasin Abbasi-Yadkori

ICML 2014

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Theoretical | We present an algorithm that achieves $O\left(\sqrt{\frac{N}{M} T \ln N}\right)$ regret on $T$ rounds of this game. ... We present an algorithm that achieves $O\left((cN \ln N)^{1/3} T^{2/3} + \sqrt{T \ln N}\right)$ regret on $T$ rounds of this game in the worst case. Furthermore, we present a number of refinements that treat arm- and time-dependent observation costs and achieve lower regret under benign conditions. We present lower bounds that show that, apart from the logarithmic factors, the worst-case regret bounds cannot be improved. More illuminating proofs are provided in Section 3, whereas more technical results are provided in the appendix. |
| Researcher Affiliation | Academia | Yevgeny Seldin (YEVGENY.SELDIN@GMAIL.COM), Queensland University of Technology and UC Berkeley; Peter Bartlett (BARTLETT@EECS.BERKELEY.EDU), UC Berkeley and Queensland University of Technology; Koby Crammer (KOBY@EE.TECHNION.AC.IL), The Technion; Yasin Abbasi-Yadkori (YASIN.ABBASI@GMAIL.COM), Queensland University of Technology and UC Berkeley |
| Pseudocode | Yes | Algorithm 1: Prediction with limited advice. Algorithm 2: Multiarmed Bandits with Paid Observations. |
| Open Source Code | No | The paper does not contain any statements about releasing source code, nor does it provide links to a code repository. |
| Open Datasets | No | The paper is theoretical and does not involve empirical evaluation on datasets, so there is no mention of public or open datasets. |
| Dataset Splits | No | The paper is theoretical and does not involve empirical experiments with data, thus no dataset splits for training, validation, or testing are mentioned. |
| Hardware Specification | No | The paper focuses on theoretical contributions (algorithms, proofs, bounds) and does not describe any empirical experiments that would require specific hardware. Therefore, no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not discuss implementation details or empirical experiments that would require specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and presents algorithms and their theoretical bounds, without describing any empirical experimental setups, hyperparameter values, or system-level training settings. |
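
To give a feel for how the two worst-case regret rates quoted under Research Type scale, the following is a minimal Python sketch that evaluates their leading terms. It is an illustration only, not code from the paper: the function names are hypothetical, constants hidden by the O-notation are dropped, and it assumes N is the number of experts or arms, M the number of experts queried per round, c the cost of one observation, and T the number of rounds.

```python
import math


def limited_advice_rate(n_experts: int, n_queried: int, horizon: int) -> float:
    """Leading term sqrt((N / M) * T * ln N); constants are omitted (assumption)."""
    return math.sqrt(n_experts / n_queried * horizon * math.log(n_experts))


def paid_observations_rate(n_arms: int, cost: float, horizon: int) -> float:
    """Leading terms (c N ln N)^(1/3) * T^(2/3) + sqrt(T ln N); constants omitted."""
    log_n = math.log(n_arms)
    cost_term = (cost * n_arms * log_n) ** (1 / 3) * horizon ** (2 / 3)
    free_term = math.sqrt(horizon * log_n)
    return cost_term + free_term


if __name__ == "__main__":
    # Hypothetical settings chosen only to show the trend.
    for m in (1, 5, 10):
        print("M =", m, "rate =", limited_advice_rate(10, m, 10_000))
    for c in (0.0, 0.1, 1.0):
        print("c =", c, "rate =", paid_observations_rate(10, c, 10_000))
```

In this sketch, querying all N experts (M = N) leaves the rate at sqrt(T ln N), while querying a single expert inflates it by a factor of sqrt(N). Likewise, setting c = 0 removes the T^(2/3) term from the paid-observations rate and leaves only sqrt(T ln N).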