Combinatorial Pure Exploration with Full-Bandit or Partial Linear Feedback

Authors: Yihan Du, Yuko Kuroki, Wei Chen

AAAI 2021, pp. 7262-7270 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our empirical evaluation demonstrates that our algorithms run orders of magnitude faster than the existing ones; our CPE-BL algorithm is robust across different ∆min settings, while our CPE-PL algorithm is the first to return correct answers for nonlinear reward functions.
Researcher Affiliation | Collaboration | Yihan Du (IIIS, Tsinghua University), Yuko Kuroki (The University of Tokyo and RIKEN), Wei Chen (Microsoft Research); duyh18@mails.tsinghua.edu.cn, ykuroki@ms.k.u-tokyo.ac.jp, weic@microsoft.com
Pseudocode | Yes | Algorithm 1: ALBA(S, δ) (Tao, Blanco, and Zhou 2018); Algorithm 2: Elim Tilp(S, δ); Algorithm 3: Vector Est(λ, n); Algorithm 4: Poly ALBA; Algorithm 5: Computing a distribution λ; Algorithm 6: GCB-PE. (A generic reading of Vector Est is sketched in the second code example after this table.)
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | No | The paper describes generating synthetic data: 'θ1, ..., θd is set as a geometric sequence in [0, 1]. We simulate the random feedback for action x by a Gaussian distribution with mean of x⊤θ and unit variance.' This indicates a simulated dataset rather than a publicly available one with concrete access information. (A minimal reconstruction of this setup appears in the first code example after this table.)
Dataset Splits | No | The paper does not provide dataset split information (e.g., train/validation/test percentages, specific sample counts, or citations to predefined splits) needed to reproduce data partitioning.
Hardware Specification | Yes | We evaluate all the compared algorithms on an Intel Xeon E5-2640 v3 CPU at 2.60 GHz with 132 GB RAM.
Software Dependencies | No | The paper does not provide software dependency details with version numbers.
Experiment Setup | No | The paper does not provide concrete hyperparameter values or detailed training configurations (e.g., learning rate, batch size, epochs) in the main text, as would be typical for machine learning experiments.
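
The synthetic setup quoted in the Open Datasets row is compact enough to reconstruct. Below is a minimal Python sketch of that generation process; the dimension d, the common ratio of the geometric sequence, and the example action are illustrative assumptions, since the quoted excerpt does not specify them.

```python
import numpy as np

# Hypothetical reconstruction of the simulated dataset quoted under
# "Open Datasets": theta_1, ..., theta_d is a geometric sequence in
# [0, 1], and feedback for an action x is Gaussian with mean x^T theta
# and unit variance. The dimension d, the common ratio, and the example
# action below are illustrative assumptions, not values from the paper.
d = 10                          # assumed dimension
ratio = 0.5                     # assumed common ratio in (0, 1)
theta = ratio ** np.arange(d)   # geometric sequence theta_1..theta_d in [0, 1]

rng = np.random.default_rng(0)

def sample_feedback(x: np.ndarray) -> float:
    """Draw one full-bandit observation: N(x^T theta, 1)."""
    return rng.normal(loc=x @ theta, scale=1.0)

# Example: a binary action that includes the first three base arms.
x = np.zeros(d)
x[:3] = 1.0
print(sample_feedback(x))
```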
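The Pseudocode row lists Vector Est(λ, n) among the paper's algorithms. Under the linear full-bandit model above, one natural reading is an ordinary-least-squares estimate of θ from n samples drawn according to a distribution λ over the actions. The sketch below implements that generic estimator; it is an assumption, not the paper's pseudocode, that Vector Est works this way.

```python
import numpy as np

# Generic ordinary-least-squares estimate of theta from n full-bandit
# observations, with actions drawn i.i.d. from a distribution lam over a
# finite action set. Whether Algorithm 3 (Vector Est(λ, n)) computes
# exactly this is an assumption; this is a standard estimator for the
# linear full-bandit model, not the paper's pseudocode.
def estimate_theta(actions, lam, n, sample_feedback, rng):
    # actions: (K, d) array of candidate action vectors
    # lam:     (K,) sampling distribution over the actions
    d = actions.shape[1]
    A = np.zeros((d, d))            # empirical design matrix: sum of x x^T
    b = np.zeros(d)                 # sum of x * observed reward
    for i in rng.choice(len(actions), size=n, p=lam):
        x = actions[i]
        A += np.outer(x, x)
        b += x * sample_feedback(x)
    return np.linalg.pinv(A) @ b    # pseudo-inverse guards against singular A
```

Pairing this with the sample_feedback from the first sketch, one would pass a (K, d) action matrix, a sampling distribution such as lam = np.full(K, 1/K), a sample budget n, and the same rng.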