Combinatorial Pure Exploration with Full-Bandit or Partial Linear Feedback
Authors: Yihan Du, Yuko Kuroki, Wei Chen (pp. 7262-7270)
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical evaluation demonstrates that our algorithms run orders of magnitude faster than the existing ones, and our CPE-BL algorithm is robust across different min settings while our CPE-PL algorithm is the first one returning correct answers for nonlinear reward functions. |
| Researcher Affiliation | Collaboration | Yihan Du,1 Yuko Kuroki,2 Wei Chen3 1IIIS, Tsinghua University, 2The University of Tokyo, RIKEN, 3Microsoft Research duyh18@mails.tsinghua.edu.cn, ykuroki@ms.k.u-tokyo.ac.jp, weic@microsoft.com |
| Pseudocode | Yes | Algorithm 1: ALBA(S, δ) (Tao, Blanco, and Zhou 2018), Algorithm 2: Elim Tilp(S, δ), Algorithm 3: Vector Est(λ, n), Algorithm 4: Poly ALBA, Algorithm 5: Computing a distribution λ, Algorithm 6: GCB-PE |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | No | The paper describes generating data: 'θ1, . . . , θd is set as a geometric sequence in [0, 1]. We simulate the random feedback for action x by a Gaussian distribution with mean of x^T θ and unit variance.' This indicates a simulated or synthetic dataset, not a publicly available one with concrete access information. |
| Dataset Splits | No | The paper does not provide specific dataset split information (e.g., percentages for train/validation/test, specific sample counts, or citations to predefined splits) to reproduce data partitioning. |
| Hardware Specification | Yes | We evaluate all the compared algorithms on Intel Xeon E5-2640 v3 CPU at 2.60GHz with 132GB RAM. |
| Software Dependencies | No | The paper does not provide specific software dependency details with version numbers. |
| Experiment Setup | No | The paper does not provide concrete hyperparameter values or detailed training configurations in the main text that would be typical for machine learning experiments (e.g., learning rate, batch size, epochs). |
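The simulated-feedback setup quoted in the "Open Datasets" row can be sketched as follows. This is a hypothetical reconstruction, not the authors' code: the dimension `d`, the common ratio of the geometric sequence, and the choice of action `x` are all assumptions, since the paper (as quoted here) does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed parameters (not stated in the quoted passage).
d = 8
ratio = 0.5

# theta_1, ..., theta_d as a geometric sequence in [0, 1]: 1, 1/2, 1/4, ...
theta = ratio ** np.arange(d)

def sample_feedback(x, rng=rng):
    """One noisy full-bandit observation for action vector x:
    Gaussian with mean x^T theta and unit variance, as described in the paper."""
    return rng.normal(loc=x @ theta, scale=1.0)

# Example action: select all base arms.
x = np.ones(d)
obs = [sample_feedback(x) for _ in range(10_000)]
mean_obs = np.mean(obs)   # concentrates near x @ theta
```

Averaging many observations recovers the mean reward `x @ theta`, which is the linear-reward structure the CPE-BL/CPE-PL algorithms exploit.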