Fast Pure Exploration via Frank-Wolfe
Authors: Po-An Wang, Ruo-Chun Tzeng, Alexandre Proutiere
NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply FWS to various pure exploration tasks, including best arm identification in unstructured, thresholded, linear, and Lipschitz bandits. Despite its simplicity, FWS is competitive compared to state-of-art algorithms. Using numerical experiments, we show that FWS is competitive with state-of-the-art algorithms for BAI in unstructured, linear, and Lipschitz bandits, see Appendices D-E-F, respectively. To the best of our knowledge, we report the first results for BAI in Lipschitz bandits. We quote some of our results for BAI in linear bandits below. In Table 1, we present the sample complexity (the number of samples gathered before the algorithm stops) averaged over 1000 runs for the various algorithms and for different confidence levels δ {0.1, 0.01, 0.001, 0.0001}. |
| Researcher Affiliation | Academia | Po-An Wang KTH Royal Institute of Technology Stockholm, Sweden wang9@kth.se Ruo-Chun Tzeng KTH Royal Institute of Technology Stockholm, Sweden rctzeng@kth.se Alexandre Proutiere EECS and Digital Futures KTH, Stockholm, Sweden alepro@kth.se |
| Pseudocode | Yes | Algorithm 1: FWS algorithm Input: Confidence level δ, sequence {rt}t 1 Initialization: Sample each arm once and update ω(K), x(K) = ( 1 K , . . . , 1 K ), and ˆµ(K) t K While (t Fˆµ(t)(ω(t)) < β(δ, t) or ˆµ(t 1) / Λ) t t + 1 If ( p t/K N or ˆµ(t 1) / Λ) (forced exploration) z(t) ( 1 K , . . . , 1 K ) Else (FW update) z(t) argmax z Σ min h HFˆ µ(t 1)(x(t 1),rt) z x(t 1), h (ties broken arbitrarily) Update x(t) t 1 t x(t 1) + 1 t z(t) Sample the arm At argmaxk xk(t)/ωk(t 1) (ties broken arbitrarily) Update ω(t) and ˆµ(t) Output: i (ˆµ(t)) |
| Open Source Code | No | The paper references an external GitHub link for a comparison algorithm, LinBAI.jl, but does not provide concrete access to the source code for the FWS algorithm itself. |
| Open Datasets | Yes | We consider the example proposed by [35]. The unknown parameter µ = e1 and there are d + 1 arms, e1, , ed, cos(φ)e1 + sin(φ)e2 in Rd, where (e1, , ed) form the standard orthonormal basis. We set d = 6 and φ = 0.1. |
| Dataset Splits | No | The paper mentions numerical experiments using an example from [35] and averaging over 1000 runs, but does not specify dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper does not provide any specific hardware details used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | We consider the example proposed by [35]. The unknown parameter µ = e1 and there are d + 1 arms, e1, , ed, cos(φ)e1 + sin(φ)e2 in Rd, where (e1, , ed) form the standard orthonormal basis. We set d = 6 and φ = 0.1. Except for XY-A, all algorithms implement the same stopping rule defined in (7) with threshold β(t, δ) = log((log(t) + 1)/δ) (this threshold was initially suggested in [13], and is also used in [34] for CG-C and Lk-C). For XY-A, we use the stopping rule advocated in the corresponding papers. Refer to Appendix E for the detailed implementations. |