Adaptive Multiple-Arm Identification
Authors: Jiecao Chen, Xi Chen, Qin Zhang, Yuan Zhou
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we present the experimental results. While our theorems are presented in the PAC form, it is in general difficult to verify them directly because the parameter ϵ is merely an upper bound and the actual aggregate regret may deviate from it. In our experiment, we convert our Algorithm 1 to the fixed-budget version (that is, fix the budget of the number of pulls and calculate the aggregate regret). We compare our Algorithm 1 (ADAPTIVETOPK) with two state-of-the-art methods, OptMAI in Zhou et al. (2014) and CLUCB-PAC in Chen et al. (2014). We test our algorithm on both synthetic and real datasets. (A sketch of this fixed-budget evaluation appears after the table.) |
| Researcher Affiliation | Academia | ¹Computer Science Department, Indiana University, Bloomington, IN, USA. ²Stern School of Business, New York University, New York, NY, USA. |
| Pseudocode | Yes | Algorithm 1 ADAPTIVETOPK(n, ϵ, K, δ) |
| Open Source Code | No | The paper does not provide a direct link to the source code for the methodology described, nor does it explicitly state that the code is open-source or available. |
| Open Datasets | No | We test our algorithm on both synthetic and real datasets (due to space constraints, our results on real datasets are deferred to the full version of this paper), described as follows. TWOGROUP: the mean reward for the top K arms is set to 0.7 and that for the rest of the arms is set to 0.3. UNIFORM: we set θ_i = 1 − i/n for 1 ≤ i ≤ n. SYNTHETIC-p: we set θ_i = (1 − K/n)^p for each i ≤ K and θ_i = (1 − i/n)^p for each i > K. (A generator sketch for these datasets appears after the table.) |
| Dataset Splits | No | For each dataset, we first fix the budget (total number of pulls allowed) and run each algorithm 200 times. For each algorithm, we calculate the empirical probability (over 200 runs) that the aggregate regret of the selected arms is above the tolerance threshold ϵ = 0.01, which is called the failure probability. A smaller failure probability means better performance. For each dataset and each value of K, we plot the curve of failure probability by varying the number of pulls. The paper does not describe typical training/validation/test splits for the datasets. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | We set the total number of arms n = 1,000, the tolerance parameter ϵ = 0.01, and vary the parameter K. In ADAPTIVETOPK and CLUCB-PAC, another parameter δ (i.e., the failure probability) is required and we set δ = 0.01. |
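
The three synthetic datasets quoted in the Open Datasets row are simple enough to regenerate. Below is a minimal Python/NumPy sketch; note that the SYNTHETIC-p formula is our reconstruction of a garbled extraction (top K arms tied at (1 − K/n)^p, remaining arms decaying as (1 − i/n)^p) and should be checked against the original PDF, and the values of K and p in the final line are illustrative only.

```python
import numpy as np

def two_group(n: int, K: int) -> np.ndarray:
    """TWOGROUP: top K arms have mean reward 0.7, the rest 0.3."""
    theta = np.full(n, 0.3)
    theta[:K] = 0.7
    return theta

def uniform(n: int) -> np.ndarray:
    """UNIFORM: theta_i = 1 - i/n for 1 <= i <= n."""
    i = np.arange(1, n + 1)
    return 1.0 - i / n

def synthetic_p(n: int, K: int, p: float) -> np.ndarray:
    """SYNTHETIC-p (our reconstruction of the garbled formula):
    theta_i = (1 - K/n)^p for i <= K, (1 - i/n)^p for i > K."""
    i = np.arange(1, n + 1)
    theta = (1.0 - i / n) ** p
    theta[:K] = (1.0 - K / n) ** p
    return theta

# The paper's experiments use n = 1,000 arms; K and p here are illustrative.
theta = synthetic_p(n=1_000, K=10, p=2.0)
```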
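The evaluation protocol described in the Research Type and Dataset Splits rows — fix a pull budget, run each algorithm 200 times, and report the fraction of runs whose aggregate regret exceeds ϵ = 0.01 — can likewise be sketched directly. The aggregate-regret formula below follows the paper's definition (the per-arm gap between the sum of the true top-K means and the sum of the means of the K selected arms); `pull_algorithm` is a hypothetical stand-in for the fixed-budget version of ADAPTIVETOPK, OptMAI, or CLUCB-PAC, not an API from the paper.

```python
import numpy as np

def aggregate_regret(theta: np.ndarray, selected: np.ndarray, K: int) -> float:
    # Aggregate regret: (1/K) * (sum of the K largest true means
    # minus the sum of the means of the K selected arms).
    top_k_sum = np.sort(theta)[-K:].sum()
    return (top_k_sum - theta[selected].sum()) / K

def failure_probability(pull_algorithm, theta: np.ndarray, K: int,
                        budget: int, eps: float = 0.01,
                        runs: int = 200, seed: int = 0) -> float:
    # Empirical probability, over `runs` independent trials, that the
    # aggregate regret of the returned arm set exceeds the tolerance eps.
    rng = np.random.default_rng(seed)
    failures = 0
    for _ in range(runs):
        # `pull_algorithm` spends at most `budget` Bernoulli pulls and
        # returns the indices of the K arms it selects (hypothetical API).
        selected = np.asarray(pull_algorithm(theta, K, budget, rng))
        if aggregate_regret(theta, selected, K) > eps:
            failures += 1
    return failures / runs
```

Sweeping `budget` and plotting `failure_probability` against it reproduces the failure-probability curves the quoted text describes; a smaller value at a given budget means better performance.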