reproducibilityindex.ai

Optimal Best-Arm Identification Methods for Tail-Risk Measures

Authors: Shubhada Agrawal, Wouter M. Koolen, Sandeep Juneja

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Figure 1: Histogram of stopping times among 1000 runs on 3 arms, as a function of conﬁdence δ. Vertical bars (solid) indicate the lower bound (4), and (dashed) a version adjusted to our stopping threshold (7), i.e., the n that solves n = β(n, δ)V (µ)-1. In our experiments we implement a version of Track-and-Stop including C-tracking and forced exploration and apply it to Fisher-Tippett (F(µ, σ, γ)), Pareto (P(µ, σ, γ)), and mixtures of Fisher Tippett arms (these heavy-tailed distributions arise in extreme value theory).
Researcher Affiliation	Academia	Shubhada Agrawal TIFR, Mumbai, India shubhadaiitd@gmail.com Wouter M. Koolen CWI, Amsterdam wmkoolen@cwi.nl Sandeep Juneja TIFR, Mumbai, India juneja@tifr.res.in
Pseudocode	No	The paper describes the algorithm components (sampling rule, stopping rule, recommendation rule) in text but does not provide a structured pseudocode or algorithm block.
Open Source Code	No	The paper does not provide concrete access to source code or explicitly state that it is open-source.
Open Datasets	No	In our experiments we implement a version of Track-and-Stop including C-tracking and forced exploration and apply it to Fisher-Tippett (F(µ, σ, γ)), Pareto (P(µ, σ, γ)), and mixtures of Fisher Tippett arms (these heavy-tailed distributions arise in extreme value theory). Figure 1 shows the distribution of the stopping time as a function of δ in a synthetic three-arm task. No publicly available datasets are mentioned or linked.
Dataset Splits	No	The paper describes a synthetic multi-armed bandit task where samples are generated sequentially by the algorithm; therefore, traditional dataset splits (training, validation, test) are not applicable or mentioned.
Hardware Specification	No	The paper does not explicitly describe the hardware used for running its experiments.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers needed for replication.
Experiment Setup	Yes	We select ϵ = 0.7 and B = 4.5. This is a moderately hard problem of complexity V -1(µ) = 49.7. We conclude that even at moderate δ the average sample complexity closely matches the lower bound, especially after adjusting it for the lowerorder terms in the employed stopping threshold β(n, δ).