Optimal Best-Arm Identification Methods for Tail-Risk Measures
Authors: Shubhada Agrawal, Wouter M. Koolen, Sandeep Juneja
NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Figure 1: Histogram of stopping times among 1000 runs on 3 arms, as a function of confidence δ. Vertical bars (solid) indicate the lower bound (4), and (dashed) a version adjusted to our stopping threshold (7), i.e., the n that solves n = β(n, δ)V (µ)-1. In our experiments we implement a version of Track-and-Stop including C-tracking and forced exploration and apply it to Fisher-Tippett (F(µ, σ, γ)), Pareto (P(µ, σ, γ)), and mixtures of Fisher Tippett arms (these heavy-tailed distributions arise in extreme value theory). |
| Researcher Affiliation | Academia | Shubhada Agrawal TIFR, Mumbai, India shubhadaiitd@gmail.com Wouter M. Koolen CWI, Amsterdam wmkoolen@cwi.nl Sandeep Juneja TIFR, Mumbai, India juneja@tifr.res.in |
| Pseudocode | No | The paper describes the algorithm components (sampling rule, stopping rule, recommendation rule) in text but does not provide a structured pseudocode or algorithm block. |
| Open Source Code | No | The paper does not provide concrete access to source code or explicitly state that it is open-source. |
| Open Datasets | No | In our experiments we implement a version of Track-and-Stop including C-tracking and forced exploration and apply it to Fisher-Tippett (F(µ, σ, γ)), Pareto (P(µ, σ, γ)), and mixtures of Fisher Tippett arms (these heavy-tailed distributions arise in extreme value theory). Figure 1 shows the distribution of the stopping time as a function of δ in a synthetic three-arm task. No publicly available datasets are mentioned or linked. |
| Dataset Splits | No | The paper describes a synthetic multi-armed bandit task where samples are generated sequentially by the algorithm; therefore, traditional dataset splits (training, validation, test) are not applicable or mentioned. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers needed for replication. |
| Experiment Setup | Yes | We select ϵ = 0.7 and B = 4.5. This is a moderately hard problem of complexity V -1(µ) = 49.7. We conclude that even at moderate δ the average sample complexity closely matches the lower bound, especially after adjusting it for the lowerorder terms in the employed stopping threshold β(n, δ). |