An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits
Authors: Andrea Tirinzoni, Matteo Pirotta, Marcello Restelli, Alessandro Lazaric
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we verify that our algorithm obtains better empirical performance than state-of-the-art baselines. [...] 6 Numerical Simulations: We compare SOLID to LinUCB, LinTS, and OAM. [...] All plots are the results of 100 runs with 95% Student's t confidence intervals. |
| Researcher Affiliation | Collaboration | Andrea Tirinzoni (Politecnico di Milano, andrea.tirinzoni@polimi.it); Matteo Pirotta (Facebook AI Research, pirotta@fb.com); Marcello Restelli (Politecnico di Milano, marcello.restelli@polimi.it); Alessandro Lazaric (Facebook AI Research, lazaric@fb.com) |
| Pseudocode | Yes | Algorithm 1: SOLID |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | The paper mentions 'results on a real dataset' in Appendix K, but Appendix K is not provided in the given text, and no specific access information (link, DOI, formal citation with author/year) is given within the provided text for any dataset, public or otherwise. |
| Dataset Splits | No | The paper does not provide specific details on training, validation, or test dataset splits. For the 'Toy problem' and 'Random problems' used in simulations, data is generated rather than being a pre-split fixed dataset. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies (e.g., library names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | For SOLID, we set $\beta_t = \sigma^2(\log(t) + d \log\log(n))$ and $\gamma_t = \sigma^2(\log(S_t) + d \log\log(n))$ (i.e., we remove all numerical constants) and we use the exponential schedule for phases defined in Thm. 2. For OAM, we set forced-exploration $\epsilon = 0.01$ and solve (P) every 100 rounds to speed up execution as computation becomes prohibitive. |
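
The two SOLID thresholds quoted in the Experiment Setup row can be evaluated directly. The snippet below is a minimal sketch, not the authors' code: the function name `solid_thresholds` and its argument names are hypothetical, and `s_t` is simply passed in as a positive number standing for $S_t$ as defined in the paper. It assumes natural logarithms and drops all numerical constants, as the quoted setup does.

```python
import math


def solid_thresholds(t, s_t, n, d, sigma=1.0):
    """Evaluate beta_t and gamma_t from the quoted setup (constants removed).

    Hypothetical helper for illustration only.
    t     : current round (t >= 1)
    s_t   : value of S_t as defined in the paper (s_t >= 1 assumed here)
    n     : horizon (n > 1 so that log(log(n)) is defined)
    d     : feature dimension
    sigma : noise scale, entering as sigma squared
    """
    if t < 1 or s_t < 1:
        raise ValueError("t and s_t must be at least 1")
    if n <= 1:
        raise ValueError("n must be greater than 1 so log(log(n)) is defined")

    loglog_n = math.log(math.log(n))
    # beta_t = sigma^2 * (log(t) + d * log(log(n)))
    beta_t = sigma ** 2 * (math.log(t) + d * loglog_n)
    # gamma_t = sigma^2 * (log(S_t) + d * log(log(n)))
    gamma_t = sigma ** 2 * (math.log(s_t) + d * loglog_n)
    return beta_t, gamma_t


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    beta, gamma = solid_thresholds(t=500, s_t=40, n=100_000, d=8, sigma=1.0)
    print(f"beta_t = {beta:.3f}, gamma_t = {gamma:.3f}")
```

Both thresholds share the $d \log\log(n)$ term, so they differ only through $\log(t)$ versus $\log(S_t)$, and they coincide whenever $S_t = t$.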