Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Simple regret for infinitely many armed bandits
Authors: Alexandra Carpentier, Michal Valko
ICML 2015 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5. Numerical simulations To simulate different regimes of the performance according to β-regularity, we consider different reservoir distributions of the arms. In particular, we consider beta distributions B(x, y) with as x = 1 and y = β. For B(1, β), the Assumption 2 is satisfied precisely with regularity β. Since to our best knowledge, SiRI is the first algorithm optimizing simple regret in the infinitely many arms setting, there is no natural competitor for it. Nonetheless, in our experiments we compare to the algorithms designed for linked settings. ... All the experiments have some specific beta distribution as a reservoir and the arm pulls are noised with N(0, 1) truncated to [0, 1]. We perform 3 experiments based on different regimes of β coming from our analysis: β < 2, β = 2, and β > 2. |
| Researcher Affiliation | Academia | Alexandra Carpentier EMAIL Statistical Laboratory, CMS, Wilberforce Road, CB3 0WB, University of Cambridge, United Kingdom Michal Valko EMAIL INRIA Lille Nord Europe, SequeL team, 40 avenue Halley 59650, Villeneuve d'Ascq, France |
| Pseudocode | Yes | Algorithm 1 SiRI: Simple Regret for Infinitely Many Armed Bandits. Parameters: $\beta$, $C$, $\delta$. Initial pull of arms from the reservoir: Choose $T_\beta$ arms from the reservoir L. Pull each of the $T_\beta$ arms once. $t \leftarrow T_\beta$. Choice between these arms: while $t \le n$ do: For any $k \le T_\beta$: $B_{k,t} \leftarrow \hat{\mu}_{k,t} + 2\sqrt{\frac{C}{T_{k,t}} \log\left(2^{2t_\beta/b}/(T_{k,t}\delta)\right)} + \frac{2C}{T_{k,t}} \log\left(2^{2t_\beta/b}/(T_{k,t}\delta)\right)$ (4). Pull $T_{k,t}$ times the arm $k_t$ that maximizes $B_{k,t}$ and receive $T_{k,t}$ samples from it. $t \leftarrow t + T_{k,t}$. end while. Output: Return the most pulled arm $\hat{k}$. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing the source code or a link to a code repository for the methodology described. |
| Open Datasets | No | The paper describes generating data for its simulations ('we consider beta distributions B(x, y) with as x = 1 and y = β. For B(1, β), the Assumption 2 is satisfied precisely with regularity β. ... All the experiments have some specific beta distribution as a reservoir and the arm pulls are noised with N(0, 1) truncated to [0, 1]'), but it does not provide access information for a publicly available dataset. |
| Dataset Splits | No | The paper describes simulations (e.g., '100 simulations') but does not specify training, validation, or test dataset splits or a methodology for creating them. |
| Hardware Specification | No | The paper does not provide any specific hardware details (like CPU/GPU models or memory) used for running its simulations. |
| Software Dependencies | No | The paper describes algorithms and theoretical results, and includes numerical simulations, but does not specify any software dependencies with version numbers (e.g., programming languages, libraries, or frameworks). |
| Experiment Setup | Yes | In all our experiments, we set constant A of SiRI to 0.3, constant C to 1, and confidence δ to 0.01. |
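
The simulation setup quoted in the table can be illustrated with a minimal sketch, which is not the authors' code. It draws arm means from a B(1, β) reservoir and checks empirically that the fraction of arms within ε of 1 is close to ε^β, which is what regularity β means for this distribution (the CDF of B(1, β) is 1 − (1 − x)^β). The function names, the sample size, ε = 0.2, and the example values β ∈ {1, 2, 3} are hypothetical choices made for illustration. `noisy_pull` reflects only one possible reading of "noised with N(0, 1) truncated to [0, 1]" (add Gaussian noise, then clip the reward), which is an assumption.

```python
import numpy as np


def sample_reservoir(num_arms, beta, rng):
    """Draw arm means from the Beta(1, beta) reservoir."""
    return rng.beta(1.0, beta, size=num_arms)


def tail_mass(means, eps):
    """Empirical fraction of arms whose mean lies within eps of 1."""
    return float(np.mean(means > 1.0 - eps))


def noisy_pull(mean, rng):
    """Assumed reading of the noise model: add N(0, 1), then clip to [0, 1]."""
    return float(np.clip(mean + rng.normal(0.0, 1.0), 0.0, 1.0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    eps = 0.2  # hypothetical tolerance, not taken from the paper
    for beta in (1.0, 2.0, 3.0):  # one example value per regime
        means = sample_reservoir(200_000, beta, rng)
        # For Beta(1, beta), P(mean > 1 - eps) equals eps ** beta exactly.
        print(beta, tail_mass(means, eps), eps ** beta)
```

The printed empirical and theoretical tail masses should agree up to sampling error, with smaller β placing more arms near the optimum.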