Multi-armed Bandit Requiring Monotone Arm Sequences
Authors: Ningyuan Chen
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct numerical studies to compare the regret of Algorithm 1 with a few benchmarks. The experiments are conducted on a Windows 10 desktop with Intel i9-10900K CPU. The average regret over the 100 instances is shown in Figure 3. (A sketch of this averaging protocol appears after the table.) |
| Researcher Affiliation | Academia | Ningyuan Chen, Rotman School of Management, University of Toronto, 105 St George St, Toronto, ON, Canada. ningyuan.chen@utoronto.ca |
| Pseudocode | Yes | Algorithm 1 Increasing arm sequence (a hedged sketch of such a procedure follows the table) |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | For each instance, we generate $x_1 \sim U(0, 1)$ and $x_2 \sim U(0.5, 1)$, and let $f(x)$ be the linear interpolation of the three points $(0, 0)$, $(x_1, x_2)$, and $(1, 0)$. The reward $Z_t \sim N(f(X_t), 0.1)$ for all instances and $T$. (The authors generated their own data for the numerical experiments, and no access information is provided for this generated data; a sketch of this generator follows the table.) |
| Dataset Splits | No | The paper does not specify any training/validation/test dataset splits. It describes generating instances and running simulations for a given horizon T. |
| Hardware Specification | Yes | The experiments are conducted on a Windows 10 desktop with Intel i9-10900K CPU. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers. |
| Experiment Setup | Yes | By choosing $K = T^{1/4}$ and $m = T^{1/2}$, the regret of Algorithm 1 satisfies... and As stated in Theorem 2, we use $K = T^{1/4}$, $m = T^{1/2}$ and $\sigma = 0.1$. (These parameter choices are used in the sketch after the table.) |
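The instance generator quoted in the Open Datasets row is self-contained enough to sketch directly. Below is a minimal reconstruction: `f` is the piecewise-linear "tent" through $(0,0)$, $(x_1, x_2)$, $(1,0)$, and rewards are Gaussian around $f$. Function names (`make_instance`, `reward`) are ours, and we read the quoted $\sigma = 0.1$ as the standard deviation of the noise, which the paper's $N(f(X_t), 0.1)$ leaves slightly ambiguous.

```python
import numpy as np

def make_instance(rng):
    """One synthetic instance per the quoted setup: f is the linear
    interpolation of (0, 0), (x1, x2), (1, 0), so it peaks at x1."""
    x1 = rng.uniform(0.0, 1.0)   # peak location, x1 ~ U(0, 1)
    x2 = rng.uniform(0.5, 1.0)   # peak height,   x2 ~ U(0.5, 1)

    def f(x):
        # Piecewise-linear "tent" through the three anchor points.
        return np.interp(x, [0.0, x1, 1.0], [0.0, x2, 0.0])

    return f, x1, x2

def reward(f, x, rng, sigma=0.1):
    """Noisy reward Z_t ~ N(f(X_t), sigma), with sigma = 0.1 as quoted."""
    return rng.normal(f(x), sigma)
```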
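The table only names "Algorithm 1 Increasing arm sequence" and quotes the choices $K = T^{1/4}$, $m = T^{1/2}$; it does not reproduce the algorithm itself. The following is therefore an illustrative monotone policy consistent with those quotes, not the paper's actual procedure: discretize $[0,1]$ into $K$ grid arms, pull each arm $m$ times in strictly increasing order, stop advancing once the empirical mean drops (the peak has been passed), and, since monotonicity forbids moving back down, commit to the current arm for the rest of the horizon.

```python
import numpy as np

def increasing_arm_sequence(f, T, rng, sigma=0.1):
    """Hedged sketch of a non-decreasing arm-sequence policy using the
    quoted parameters K = T^{1/4} and m = T^{1/2}. Illustrative only."""
    K = int(np.ceil(T ** 0.25))      # K = T^{1/4} grid arms
    m = int(np.ceil(np.sqrt(T)))     # m = T^{1/2} pulls per arm
    grid = np.linspace(0.0, 1.0, K)

    rewards, t = [], 0
    prev_mean = -np.inf
    for x in grid:                   # arms played in increasing order only
        n = min(m, T - t)
        pulls = rng.normal(f(x), sigma, size=n)
        rewards.extend(pulls.tolist())
        t += n
        if t >= T:
            break
        mean = pulls.mean()
        if mean < prev_mean:         # empirical decrease: peak overshot;
            break                    # monotonicity forbids moving back
        prev_mean = mean
    # Commit to the current arm for the remaining horizon.
    if t < T:
        rewards.extend(rng.normal(f(x), sigma, size=T - t).tolist())
    return np.asarray(rewards)
```

The one-arm overshoot this stopping rule incurs costs at most one grid step in mean reward, which is the intuition behind balancing the grid resolution $1/K$ against the $m$ samples needed to detect a decrease reliably.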
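Finally, the quoted protocol averages regret over 100 random instances. A minimal driver, reusing `make_instance` and `increasing_arm_sequence` from the sketches above; the horizon `T = 10_000` is our assumption, as the table does not quote a specific $T$, and we report realized regret against the oracle that plays the peak $x_1$ (expected reward $x_2$ per round).

```python
rng = np.random.default_rng(0)
T = 10_000                # assumed horizon; not quoted in the table
regrets = []
for _ in range(100):      # 100 instances, as quoted
    f, x1, x2 = make_instance(rng)
    rewards = increasing_arm_sequence(f, T, rng)
    # Realized regret vs. always playing the peak x1.
    regrets.append(T * x2 - rewards.sum())
print("average regret:", np.mean(regrets))
```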