Nash Regret Guarantees for Linear Bandits
Authors: Ayush Sawarni, Soumyabrata Pal, Siddharth Barman
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments to compare the performance of our algorithm LINNASH with Thompson Sampling on synthetic data. |
| Researcher Affiliation | Collaboration | Ayush Sawarni, Indian Institute of Science Bangalore (sawarniayush@gmail.com); Soumyabrata Pal, Google Research Bangalore (soumyabrata@google.com); Siddharth Barman, Indian Institute of Science Bangalore (barman@iisc.ac.in) |
| Pseudocode | Yes | Algorithm 1: Generate Arm Sequence |
| Open Source Code | No | The paper does not provide explicit statements or links to open-source code for the described methodology. |
| Open Datasets | No | We conduct experiments to compare the performance of our algorithm LINNASH with Thompson Sampling on synthetic data. |
| Dataset Splits | No | The paper describes using synthetic data but does not provide specific training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions algorithms used (e.g., LINNASH, Thompson Sampling) but does not provide specific version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | We fine-tune the parameters of both algorithms and evaluate their performance in the following experimental setup: We fix the ambient dimension d = 80, the number of arms \|X\| = 10000, and the number of rounds T = 50000. Both the unknown parameter vector, θ, and the arm embeddings are sampled from a multivariate Gaussian distribution. Subsequently, the arm embeddings are shifted and normalized to ensure that all mean rewards are non-negative, with the maximum reward mean being set to 0.5. Upon pulling an arm, we observe a Bernoulli random variable with a probability corresponding to its mean reward. *(A sketch of this setup appears below the table.)* |
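
To make the quoted Experiment Setup concrete, here is a minimal Python sketch of the synthetic instance (d = 80, \|X\| = 10000, T = 50000) with Bernoulli reward feedback. The min–max rescaling used to make all mean rewards non-negative with maximum 0.5 is an assumption on our part; the paper only states that the embeddings are "shifted and normalized", and the `pull` helper below is a hypothetical name for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 80            # ambient dimension
n_arms = 10_000   # number of arms |X|
T = 50_000        # number of rounds

# Sample the unknown parameter vector and the arm embeddings
# from a standard multivariate Gaussian.
theta = rng.standard_normal(d)
arms = rng.standard_normal((n_arms, d))

# Raw mean rewards <x, theta> may be negative. As an illustrative stand-in
# for the paper's shift-and-normalize step, rescale the means so they lie
# in [0, 0.5] with the maximum equal to 0.5. (Equivalently, the shift and
# scale could be absorbed into the embeddings themselves.)
means = arms @ theta
means = (means - means.min()) / (means.max() - means.min()) * 0.5

def pull(arm_index: int) -> int:
    """Observe a Bernoulli reward with success probability means[arm_index]."""
    return int(rng.random() < means[arm_index])

# Example: pull a few uniformly random arms, as a bandit algorithm
# (e.g., LINNASH or Thompson Sampling) would do over T rounds.
for t in range(5):
    a = int(rng.integers(n_arms))
    print(f"round {t}: arm {a}, reward {pull(a)}")
```

Rescaling the mean-reward vector rather than the embeddings keeps the sketch short; a faithful reproduction would apply the transformation to the arm vectors so that the linear structure ⟨x, θ⟩ is preserved for the learner.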