Latent Bandits.
Authors: Odalric-Ambrym Maillard, Shie Mannor
ICML 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments show that, in the most challenging agnostic case, the proposed algorithm achieves excellent performance in several difficult scenarios. |
| Researcher Affiliation | Academia | Odalric-Ambrym Maillard (ODALRIC-AMBRYM.MAILLARD@ENS-CACHAN.ORG), The Technion, Faculty of Electrical Engineering, 32000 Haifa, Israel; Shie Mannor (SHIE@EE.TECHNION.AC.IL), The Technion, Faculty of Electrical Engineering, 32000 Haifa, Israel |
| Pseudocode | Yes | Algorithm 1 The Single-K-UCB algorithm. ... Algorithm 2 The Multiple-K-UCB algorithm. ... Algorithm 3 The UCB on B algorithm ... Algorithm 4 The A-UCB algorithm |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. It links to an extended version of the paper for proofs. |
| Open Datasets | No | The paper describes generating data for its experiments based on Bernoulli distributions and specified parameters for \|A\|, \|B\|, \|C\|, and Υ(b), but it does not use a named public dataset or provide access information for a generated dataset. |
| Dataset Splits | No | The paper does not provide specific dataset split information (e.g., train/validation/test percentages or counts). It describes the parameters for the generated environments for numerical experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiments. |
| Experiment Setup | Yes | For each experiment, we show the number of actions \|A\|, of users \|B\|, of classes \|C\|, and the parameters $\{\mu_{a,c}\}_{a \in A,\, c \in C}$ when there are not too many. We plot the regret of all algorithms on the same figure: A thick line is used for the mean regret and dashed lines for quantiles at levels 0.25, 0.5, 0.75, 0.95 and 0.99. In all experiments, the parameters $\{\Upsilon(b)\}_{b \in B}$ are defined by $\Upsilon(b) = w_b / \sum_{b \in B} w_b$, where the weights $w_b$ are drawn uniformly randomly in [0.1, 0.9]. Thus for each class, the distortion factor $\gamma_c$ is less than 9, and we set the parameter $\gamma$ of A-UCB to the value $\gamma = 9$. For one experiment with given fixed parameters, the algorithms are run over several trials (500) for a large time horizon N = 25000. |
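
The user-distribution construction quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the authors' code (none was released): the function name `sample_user_distribution`, the seed, and the choice of 20 users are assumptions made for the example. It draws each weight uniformly in [0.1, 0.9], normalizes to obtain Υ(b), and checks that the ratio between any two Υ(b) values is at most 0.9 / 0.1 = 9, which is consistent with the quoted choice γ = 9. The exact definition of the distortion factor is given in the paper and is not reproduced here.

```python
import random


def sample_user_distribution(num_users, rng, low=0.1, high=0.9):
    """Draw weights w_b uniformly in [low, high] and normalize them.

    Returns a list Upsilon with Upsilon[b] = w_b / sum of all weights.
    The name and signature are hypothetical, chosen for this sketch.
    """
    weights = [rng.uniform(low, high) for _ in range(num_users)]
    total = sum(weights)
    return [w / total for w in weights]


# Seed and number of users are arbitrary choices for illustration.
rng = random.Random(0)
upsilon = sample_user_distribution(20, rng)

# Normalization cancels in the ratio, so max/min of Upsilon equals
# max/min of the raw weights, which cannot exceed 0.9 / 0.1 = 9.
max_ratio = max(upsilon) / min(upsilon)
```

Because the normalizing constant cancels, the bound of 9 holds for any number of users and any seed; only the sampling interval [0.1, 0.9] matters.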