Combinatorial Pure Exploration with Bottleneck Reward Function

Authors: Yihan Du, Yuko Kuroki, Wei Chen

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experimental results on the top-k, path and matching instances validate the empirical superiority of the proposed algorithms over their baselines.
Researcher Affiliation | Collaboration | Yihan Du (IIIS, Tsinghua University, Beijing, China; duyh18@mails.tsinghua.edu.cn); Yuko Kuroki (The University of Tokyo / RIKEN, Tokyo, Japan; yukok@is.s.u-tokyo.ac.jp); Wei Chen (Microsoft Research, Beijing, China; weic@microsoft.com)
Pseudocode | Yes | Algorithm 1 BLUCB, algorithm for CPE-B in the FC setting
Open Source Code | No | The paper does not include any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | In terms of the real-world dataset, we use the data of American airports and the number of available seats of flights in 2002, provided by the International Air Transportation Association database (www.iata.org) [6].
Dataset Splits | No | The paper describes the characteristics of the datasets used but does not provide specific details on how these datasets were split into training, validation, or test sets (e.g., percentages or sample counts).
Hardware Specification | No | The paper discusses time complexity but does not provide any specific hardware details such as GPU models, CPU models, or memory specifications used for running the experiments.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or solvers used in the experiments.
Experiment Setup | Yes | In the FC setting, we set a large δ = 0.005 and a small δ = exp(-1000), and perform 50 independent runs to plot average sample complexity with 95% confidence intervals. In the FB setting, we set sample budget T ∈ [6000, 15000], and perform 3000 independent runs to show the error probability across runs. For all experiments, the random reward of each edge e ∈ [n] is i.i.d. drawn from Gaussian distribution N(w(e), 1).
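The reward model and run aggregation described in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not the authors' code (the paper releases none) and it does not implement BLUCB. The toy weights, the super arm, the per-run sample count, and the helper names are all hypothetical, and the normal-approximation interval (z = 1.96) is an assumption, since the paper does not say how its 95% confidence intervals were computed. The sketch relies on the bottleneck reward of a super arm being the minimum weight among its edges.

```python
import math
import random
import statistics


def bottleneck_value(weights, super_arm):
    """Bottleneck reward of a super arm: the minimum weight among its edges."""
    return min(weights[e] for e in super_arm)


def empirical_bottleneck(weights, super_arm, num_samples, rng):
    """Estimate the bottleneck value from rewards drawn i.i.d. from N(w(e), 1)."""
    means = []
    for e in super_arm:
        samples = [rng.gauss(weights[e], 1.0) for _ in range(num_samples)]
        means.append(statistics.fmean(samples))
    return min(means)


def mean_with_ci95(values):
    """Mean and half-width of a normal-approximation 95% confidence interval."""
    mean = statistics.fmean(values)
    half_width = 1.96 * statistics.stdev(values) / math.sqrt(len(values))
    return mean, half_width


# Hypothetical toy instance: expected weights w(e) for 4 edges (made up).
weights = [0.9, 0.5, 0.7, 0.3]
super_arm = [0, 2]  # hypothetical super arm, e.g. a two-edge path

# 50 independent runs, mirroring the run count quoted for the FC setting.
rng = random.Random(0)
run_estimates = [
    empirical_bottleneck(weights, super_arm, num_samples=200, rng=rng)
    for _ in range(50)
]
avg, ci = mean_with_ci95(run_estimates)
print(f"true bottleneck = {bottleneck_value(weights, super_arm)}")
print(f"estimated bottleneck = {avg:.3f} +/- {ci:.3f}")
```

In the paper's experiments the quantity averaged across runs is the sample complexity (FC setting) or the error indicator (FB setting); the bottleneck estimate is used here only as a stand-in metric so the sketch stays self-contained.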