Linear Multi-Resource Allocation with Semi-Bandit Feedback
Authors: Tor Lattimore, Koby Crammer, Csaba Szepesvári
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present two experiments to demonstrate the behaviour of Algorithm 2. All code and data is available in the supplementary material. |
| Researcher Affiliation | Academia | Tor Lattimore, Department of Computing Science, University of Alberta, Canada (tor.lattimore@gmail.com); Koby Crammer, Department of Electrical Engineering, The Technion, Israel (koby@ee.technion.ac.il); Csaba Szepesvári, Department of Computing Science, University of Alberta, Canada (szepesva@ualberta.ca) |
| Pseudocode | Yes | Algorithm 1 and Algorithm 2 are presented in the paper. |
| Open Source Code | Yes | All code and data is available in the supplementary material. |
| Open Datasets | Yes | All code and data is available in the supplementary material. For this experiment we used D = K = 2 and n = 10^6 and ν = [8/10, 2/10; 4/10, 2] and M = [1, 0; 1/2, 1/2], where the kth column is the parameter/allocation for the kth task. We fix n = 5 × 10^5 and D = K = 2. For α ∈ (0, 1) we define ν_α = [1/2, α/2; 1/2, α/2] and M = [1, 0; 1, 0]. |
| Dataset Splits | No | The paper describes a sequential online learning problem and does not explicitly state traditional training/validation/test dataset splits. It refers to a time horizon 'n' and average regret over runs. |
| Hardware Specification | No | The paper does not specify the hardware used for running the experiments (e.g., GPU/CPU models, memory details). |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | For this experiment we used D = K = 2 and n = 10^6 and ν = [8/10, 2/10; 4/10, 2] and M = [1, 0; 1/2, 1/2], where the kth column is the parameter/allocation for the kth task. We ran two versions of the algorithm. The first, exactly as given in Algorithm 2, and the second identical except that the weights were fixed to γ_tk = 4 for all t and k (this value is chosen because it corresponds to the minimum inverse variance for a Bernoulli variable). The data was produced by taking the average regret over 8 runs. We fix n = 5 × 10^5 and D = K = 2. For α ∈ (0, 1) we define ν_α = [1/2, α/2; 1/2, α/2] and M = [1, 0; 1, 0]. |
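The experiment configuration quoted above can be sketched as a short script. This is only an illustrative reconstruction: the row-major matrix layout and all variable names (`nu`, `M`, `nu_alpha`, `gamma_fixed`) are assumptions inferred from the flattened values in the table, not code from the paper's supplementary material.

```python
import numpy as np

# --- First experiment (assumed row-major reading of the flattened matrices) ---
D, K = 2, 2           # number of resources and tasks
n = 10**6             # time horizon
# kth COLUMN is the parameter/allocation for the kth task
nu = np.array([[8/10, 2/10],
               [4/10, 2.0]])
M = np.array([[1.0, 0.0],
              [0.5, 0.5]])
gamma_fixed = 4.0     # second variant: weights fixed to gamma_tk = 4 for all t, k
num_runs = 8          # regret is averaged over 8 runs

# --- Second experiment: n = 5 * 10**5, parameter family indexed by alpha ---
n2 = 5 * 10**5

def nu_alpha(alpha):
    """Parameter matrix nu_alpha for alpha in (0, 1), per the quoted setup."""
    assert 0 < alpha < 1
    return np.array([[0.5, alpha / 2],
                     [0.5, alpha / 2]])

M2 = np.array([[1.0, 0.0],
               [1.0, 0.0]])
```

Each matrix here is D × K, so a column can be read off as one task's parameter (or allocation) vector, matching the "kth column" convention stated in the paper.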