The Pareto Regret Frontier for Bandits
Authors: Tor Lattimore
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | I compare MOSS and unbalanced MOSS in two simple simulated examples, both with horizon $n = 5000$. Each data point is an empirical average of $10^4$ i.i.d. samples, so error bars are too small to see. Code/data is available in the supplementary material. |
| Researcher Affiliation | Academia | Tor Lattimore, Department of Computing Science, University of Alberta, Canada (tor.lattimore@gmail.com) |
| Pseudocode | Yes | Algorithm 1: Unbalanced MOSS |
| Open Source Code | Yes | Code/data is available in the supplementary material. |
| Open Datasets | No | The paper describes experiments with 'simulated examples' using parameters like '$K = 2$ arms' and '$\mu = (0, -\Delta)$', implying synthetic data generation rather than the use of a named public dataset with access information. |
| Dataset Splits | No | The paper discusses 'simulated examples' and an 'empirical average of $10^4$ i.i.d. samples,' but does not specify any training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies or version numbers. |
| Experiment Setup | Yes | I compare MOSS and unbalanced MOSS in two simple simulated examples, both with horizon $n = 5000$. Each data point is an empirical average of $10^4$ i.i.d. samples... The first experiment has $K = 2$ arms and $B_1 = n^{1/3}$ and $B_2 = n^{2/3}$. I plotted the results for $\mu = (0, -\Delta)$ for varying $\Delta$. The second experiment has $K = 10$ arms. This time $B_1 = \sqrt{n}$ and $B_k = (k-1)H\sqrt{n}$ with $H = \sum_{k=1}^{9} 1/k$. Results are shown for $\mu_k = \Delta \mathbb{1}\{k = i^*\}$ for $\Delta \in [0, 1/2]$. |
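
The regret budgets in the Experiment Setup row can be computed directly from the horizon. The following is a minimal sketch, not the author's supplementary code: the function names are hypothetical, and it assumes the reading $B_1 = \sqrt{n}$, $B_k = (k-1)H\sqrt{n}$ for the ten-arm experiment and a mean vector with a single arm of mean $\Delta$ and all others zero.

```python
import math

def budgets_two_arms(n):
    """Budgets for the K = 2 experiment: B_1 = n^(1/3), B_2 = n^(2/3)."""
    return [n ** (1 / 3), n ** (2 / 3)]

def budgets_ten_arms(n, K=10):
    """Budgets for the K = 10 experiment (assumed reading of the setup):
    B_1 = sqrt(n) and B_k = (k - 1) * H * sqrt(n) for k >= 2,
    where H = sum_{k=1}^{K-1} 1/k.
    """
    H = sum(1 / k for k in range(1, K))
    root_n = math.sqrt(n)
    return [root_n] + [(k - 1) * H * root_n for k in range(2, K + 1)]

def means_single_good_arm(delta, i_star, K=10):
    """Mean vector mu_k = delta * 1{k = i_star}, with arms indexed from 1."""
    return [delta if k == i_star else 0.0 for k in range(1, K + 1)]

n = 5000
b2 = budgets_two_arms(n)
b10 = budgets_ten_arms(n)
mu = means_single_good_arm(0.25, i_star=3)

# Pure arithmetic on the assumed formulas: sum_{k >= 2} n / B_k equals B_1.
balance = sum(n / b for b in b10[1:])
print(b2, b10[0], balance, mu)
```

Under this reading the ten-arm budgets satisfy $\sum_{k \ge 2} n / B_k = B_1$, which follows from the definition of $H$; the check says nothing about the algorithms themselves.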