Combinatorial Pure Exploration with Bottleneck Reward Function
Authors: Yihan Du, Yuko Kuroki, Wei Chen
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results on the top-k, path and matching instances validate the empirical superiority of the proposed algorithms over their baselines. |
| Researcher Affiliation | Collaboration | Yihan Du (IIIS, Tsinghua University, Beijing, China; duyh18@mails.tsinghua.edu.cn); Yuko Kuroki (The University of Tokyo / RIKEN, Tokyo, Japan; yukok@is.s.u-tokyo.ac.jp); Wei Chen (Microsoft Research, Beijing, China; weic@microsoft.com) |
| Pseudocode | Yes | Algorithm 1 BLUCB, algorithm for CPE-B in the FC setting |
| Open Source Code | No | The paper does not include any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | In terms of the real-world dataset, we use the data of American airports and the number of available seats of flights in 2002, provided by the International Air Transportation Association database (www.iata.org) [6]. |
| Dataset Splits | No | The paper describes the characteristics of the datasets used but does not provide specific details on how these datasets were split into training, validation, or test sets (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper discusses time complexity but does not provide any specific hardware details such as GPU models, CPU models, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or solvers used in the experiments. |
| Experiment Setup | Yes | In the FC setting, we set a large δ = 0.005 and a small δ = exp(−1000), and perform 50 independent runs to plot average sample complexity with 95% confidence intervals. In the FB setting, we set sample budget T ∈ [6000, 15000], and perform 3000 independent runs to show the error probability across runs. For all experiments, the random reward of each edge e ∈ [n] is i.i.d. drawn from Gaussian distribution N(w(e), 1). |
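
Because no source code is released, the reward model and reporting convention in the Experiment Setup row can only be illustrated, not reproduced. The Python sketch below shows one plausible reading: each edge reward is drawn from N(w(e), 1), a super arm is scored by its bottleneck (minimum) edge weight, and a metric is summarized across independent runs with a 95% confidence interval. The function names, the toy weights, and the normal-approximation interval are assumptions made for illustration; they are not taken from the paper.

```python
import math
import random
import statistics


def sample_reward(mean, rng):
    """Draw one noisy observation of an edge from N(mean, 1)."""
    return rng.gauss(mean, 1.0)


def bottleneck_value(weights, super_arm):
    """Bottleneck reward of a super arm: the minimum weight among its edges."""
    return min(weights[e] for e in super_arm)


def mean_with_ci95(values):
    """Sample mean and half-width of a normal-approximation 95% interval."""
    mean = statistics.fmean(values)
    half_width = 1.96 * statistics.stdev(values) / math.sqrt(len(values))
    return mean, half_width


# Hypothetical toy instance: four edges with expected weights w(e).
weights = [0.9, 0.5, 0.7, 0.3]
rng = random.Random(0)  # fixed seed so repeated runs are comparable

# Empirical mean of one edge after 1000 pulls (a stand-in for a per-run metric).
pulls = [sample_reward(weights[0], rng) for _ in range(1000)]
mean, half_width = mean_with_ci95(pulls)
```

In this toy instance, `bottleneck_value(weights, [0, 2])` evaluates to `0.7`, since edge 2 is the weakest edge in that super arm.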