An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits

Authors: Andrea Tirinzoni, Matteo Pirotta, Marcello Restelli, Alessandro Lazaric

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we verify that our algorithm obtains better empirical performance than state-of-the-art baselines. [...] 6 Numerical Simulations. We compare SOLID to LinUCB, LinTS, and OAM. [...] All plots are the results of 100 runs with 95% Student's t confidence intervals.
Researcher Affiliation | Collaboration | Andrea Tirinzoni (Politecnico di Milano, andrea.tirinzoni@polimi.it); Matteo Pirotta (Facebook AI Research, pirotta@fb.com); Marcello Restelli (Politecnico di Milano, marcello.restelli@polimi.it); Alessandro Lazaric (Facebook AI Research, lazaric@fb.com)
Pseudocode | Yes | Algorithm 1: SOLID
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | No | The paper mentions 'results on a real dataset' in Appendix K, but Appendix K is not provided in the given text, and no specific access information (link, DOI, formal citation with author/year) is given within the provided text for any dataset, public or otherwise.
Dataset Splits | No | The paper does not provide specific details on training, validation, or test dataset splits. For the 'Toy problem' and 'Random problems' used in simulations, data is generated rather than being a pre-split fixed dataset.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory specifications) used for running its experiments.
Software Dependencies | No | The paper does not provide specific software dependencies (e.g., library names with version numbers) needed to replicate the experiment.
Experiment Setup | Yes | For SOLID, we set $\beta_t = \sigma^2(\log(t) + d \log\log(n))$ and $\gamma_t = \sigma^2(\log(S_t) + d \log\log(n))$ (i.e., we remove all numerical constants) and we use the exponential schedule for phases defined in Thm. 2. For OAM, we set forced-exploration $\epsilon = 0.01$ and solve (P) every 100 rounds to speed-up execution as computation becomes prohibitive.
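
The two thresholds quoted in the Experiment Setup row can be evaluated directly. The following is a minimal sketch, not the authors' implementation: the function name `solid_thresholds` and its argument names are hypothetical, `log` is assumed to be the natural logarithm, and `s_t` is assumed to be a positive counter standing in for S_t, which the quoted text does not define. The phase schedule from Thm. 2 is not reproduced here because the excerpt does not state it.

```python
import math


def solid_thresholds(t, s_t, n, d, sigma=1.0):
    """Evaluate the two thresholds from the quoted experiment setup.

    beta_t  = sigma^2 * (log(t)   + d * log(log(n)))
    gamma_t = sigma^2 * (log(S_t) + d * log(log(n)))

    Assumptions (not stated in the excerpt): natural logarithms, and
    s_t is a positive counter playing the role of S_t.
    """
    if t < 1 or s_t < 1:
        raise ValueError("t and s_t must be at least 1")
    if n <= 1:
        raise ValueError("n must exceed 1 so that log(log(n)) is defined")

    # Term shared by both thresholds: d * log(log(n)).
    common = d * math.log(math.log(n))

    beta_t = sigma ** 2 * (math.log(t) + common)
    gamma_t = sigma ** 2 * (math.log(s_t) + common)
    return beta_t, gamma_t


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    beta, gamma = solid_thresholds(t=100, s_t=10, n=10_000, d=2, sigma=1.0)
    print(f"beta_t = {beta:.4f}, gamma_t = {gamma:.4f}")
```

Both thresholds share the `d * log(log(n))` term, so under these assumptions they differ only through `log(t)` versus `log(s_t)` and coincide whenever `s_t == t`.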