reproducibilityindex.ai

Fast Pure Exploration via Frank-Wolfe

Authors: Po-An Wang, Ruo-Chun Tzeng, Alexandre Proutiere

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We apply FWS to various pure exploration tasks, including best arm identiﬁcation in unstructured, thresholded, linear, and Lipschitz bandits. Despite its simplicity, FWS is competitive compared to state-of-art algorithms. Using numerical experiments, we show that FWS is competitive with state-of-the-art algorithms for BAI in unstructured, linear, and Lipschitz bandits, see Appendices D-E-F, respectively. To the best of our knowledge, we report the ﬁrst results for BAI in Lipschitz bandits. We quote some of our results for BAI in linear bandits below. In Table 1, we present the sample complexity (the number of samples gathered before the algorithm stops) averaged over 1000 runs for the various algorithms and for different conﬁdence levels δ {0.1, 0.01, 0.001, 0.0001}.
Researcher Affiliation	Academia	Po-An Wang KTH Royal Institute of Technology Stockholm, Sweden wang9@kth.se Ruo-Chun Tzeng KTH Royal Institute of Technology Stockholm, Sweden rctzeng@kth.se Alexandre Proutiere EECS and Digital Futures KTH, Stockholm, Sweden alepro@kth.se
Pseudocode	Yes	Algorithm 1: FWS algorithm Input: Conﬁdence level δ, sequence {rt}t 1 Initialization: Sample each arm once and update ω(K), x(K) = ( 1 K , . . . , 1 K ), and ˆµ(K) t K While (t Fˆµ(t)(ω(t)) < β(δ, t) or ˆµ(t 1) / Λ) t t + 1 If ( p t/K N or ˆµ(t 1) / Λ) (forced exploration) z(t) ( 1 K , . . . , 1 K ) Else (FW update) z(t) argmax z Σ min h HFˆ µ(t 1)(x(t 1),rt) z x(t 1), h (ties broken arbitrarily) Update x(t) t 1 t x(t 1) + 1 t z(t) Sample the arm At argmaxk xk(t)/ωk(t 1) (ties broken arbitrarily) Update ω(t) and ˆµ(t) Output: i (ˆµ(t))
Open Source Code	No	The paper references an external GitHub link for a comparison algorithm, LinBAI.jl, but does not provide concrete access to the source code for the FWS algorithm itself.
Open Datasets	Yes	We consider the example proposed by [35]. The unknown parameter µ = e1 and there are d + 1 arms, e1, , ed, cos(φ)e1 + sin(φ)e2 in Rd, where (e1, , ed) form the standard orthonormal basis. We set d = 6 and φ = 0.1.
Dataset Splits	No	The paper mentions numerical experiments using an example from [35] and averaging over 1000 runs, but does not specify dataset splits for training, validation, or testing.
Hardware Specification	No	The paper does not provide any specific hardware details used for running the experiments.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers.
Experiment Setup	Yes	We consider the example proposed by [35]. The unknown parameter µ = e1 and there are d + 1 arms, e1, , ed, cos(φ)e1 + sin(φ)e2 in Rd, where (e1, , ed) form the standard orthonormal basis. We set d = 6 and φ = 0.1. Except for XY-A, all algorithms implement the same stopping rule deﬁned in (7) with threshold β(t, δ) = log((log(t) + 1)/δ) (this threshold was initially suggested in [13], and is also used in [34] for CG-C and Lk-C). For XY-A, we use the stopping rule advocated in the corresponding papers. Refer to Appendix E for the detailed implementations.