Combinatorial Multi-Armed Bandit with General Reward Functions
Authors: Wei Chen, Wei Hu, Fu Li, Jian Li, Yu Liu, Pinyan Lu
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Even when we use the simple greedy algorithm as the oracle, our experiments show that SDCB performs significantly better than the algorithm in [26] (see the supplementary material). |
| Researcher Affiliation | Collaboration | Microsoft Research (weic@microsoft.com); Princeton University (huwei@cs.princeton.edu); The University of Texas at Austin (fuli.theory.research@gmail.com); Tsinghua University (lapordge@gmail.com); Tsinghua University (liuyujyyz@gmail.com); Shanghai University of Finance and Economics (lu.pinyan@mail.shufe.edu.cn). The authors are listed in alphabetical order. |
| Pseudocode | Yes | Algorithm 1 SDCB (Stochastically dominant confidence bound). Algorithm 2 Lazy-SDCB with known time horizon. Algorithm 3 Lazy-SDCB without knowing the time horizon. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | No | The paper mentions applications like 'online advertising, online recommendation, wireless routing' and discusses 'K-MAX problem' and 'Expected Utility Maximization problems', but it does not specify any particular datasets used for its experiments or provide information on their public availability. |
| Dataset Splits | No | The paper does not provide specific details on train/validation/test dataset splits. While it mentions 'our experiments', it does not specify any dataset or how it was split. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run its experiments (e.g., GPU/CPU models, memory specifications). |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., libraries, frameworks, or specific solvers). |
| Experiment Setup | No | The paper is primarily theoretical, focusing on algorithm design and proofs, and its main text defers the experimental comparison to the supplementary material. It does not provide specific experimental setup details such as hyperparameters, learning rates, batch sizes, or training configurations. |