The Pareto Regret Frontier for Bandits

Authors: Tor Lattimore

NeurIPS 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "I compare MOSS and unbalanced MOSS in two simple simulated examples, both with horizon n = 5000. Each data point is an empirical average of 10^4 i.i.d. samples, so error bars are too small to see. Code/data is available in the supplementary material."
Researcher Affiliation | Academia | Tor Lattimore, Department of Computing Science, University of Alberta, Canada (tor.lattimore@gmail.com)
Pseudocode | Yes | Algorithm 1: Unbalanced MOSS
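As a rough illustration of what an unbalanced-MOSS-style index looks like, here is a minimal sketch. The exact index and constants in the paper's Algorithm 1 may differ; this version is an assumption chosen so that with the balanced budgets B_k = sqrt(K·n) it reduces to the standard MOSS index, mean_k + sqrt(log^+(n / (K·T_k)) / T_k):

```python
import numpy as np

def unbalanced_moss_index(sums, counts, B, n):
    """Compute a per-arm index for an unbalanced-MOSS-style policy.

    sums   : cumulative reward per arm
    counts : number of pulls per arm (all >= 1)
    B      : per-arm regret budgets B_k (assumed form; see lead-in)
    n      : horizon

    The confidence bonus uses log^+(x) = max(0, log(x)); a larger
    budget B_k shrinks the bonus for arm k, so the policy tolerates
    more regret against that arm.
    """
    means = sums / counts
    logplus = np.maximum(0.0, np.log(n**2 / (B**2 * counts)))
    return means + np.sqrt(logplus / counts)

# demo: indices for a toy 2-arm state with unequal budgets (hypothetical values)
print(unbalanced_moss_index(np.array([1.0, 2.0]),
                            np.array([10.0, 5.0]),
                            np.array([10.0, 100.0]), 1000))
```

Setting all budgets to sqrt(K·n) recovers the standard MOSS index exactly, since n^2 / (K·n·T_k) = n / (K·T_k).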
Open Source Code | Yes | "Code/data is available in the supplementary material."
Open Datasets | No | The paper describes experiments with "simulated examples" using parameters such as "K = 2 arms" and "µ = (0, ∆)", implying synthetic data generation rather than the use of a named public dataset with access information.
Dataset Splits | No | The paper discusses "simulated examples" and an "empirical average of 10^4 i.i.d. samples," but does not specify any training, validation, or test dataset splits.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU or GPU models) used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies or version numbers.
Experiment Setup | Yes | "I compare MOSS and unbalanced MOSS in two simple simulated examples, both with horizon n = 5000. Each data point is an empirical average of 10^4 i.i.d. samples... The first experiment has K = 2 arms and B_1 = n^{1/3} and B_2 = n^{2/3}. I plotted the results for µ = (0, ∆) for varying ∆. The second experiment has K = 10 arms. This time B_1 = √n and B_k = (k−1)H√n with H = Σ_{k=1}^{9} 1/k. Results are shown for µ_k = ∆·1{k = i*} for ∆ ∈ [0, 1/2]."
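The two-armed setup quoted above can be sketched as a small simulation. This is a minimal reproduction attempt using the standard MOSS index as the baseline policy (unit-variance Gaussian rewards and the regret definition are assumptions; the run count is reduced from the paper's 10^4 for speed):

```python
import numpy as np

def moss(mu, n, rng):
    """Run the standard MOSS index policy on a Gaussian bandit.

    mu : true arm means (rewards are mu[a] + N(0, 1), an assumption)
    n  : horizon
    Returns the pseudo-regret  sum_t (max(mu) - mu[A_t]).
    """
    K = len(mu)
    counts = np.zeros(K)
    sums = np.zeros(K)
    regret = 0.0
    for t in range(n):
        if t < K:
            a = t  # pull each arm once to initialize
        else:
            means = sums / counts
            # MOSS index: empirical mean + sqrt(log^+(n / (K T_k)) / T_k)
            bonus = np.sqrt(np.maximum(0.0, np.log(n / (K * counts))) / counts)
            a = int(np.argmax(means + bonus))
        reward = mu[a] + rng.standard_normal()
        counts[a] += 1
        sums[a] += reward
        regret += max(mu) - mu[a]
    return regret

rng = np.random.default_rng(0)
n, delta = 5000, 0.2  # horizon from the paper; delta is one sample gap
regrets = [moss((0.0, delta), n, rng) for _ in range(20)]
print(np.mean(regrets))
```

Averaging over many more runs, and sweeping delta, would reproduce one curve of the paper's first experiment; the unbalanced variant would replace the uniform n/(K·T_k) term with per-arm budgets B_1 = n^{1/3}, B_2 = n^{2/3}.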