Neural Thompson Sampling

Authors: Weitong Zhang, Dongruo Zhou, Lihong Li, Quanquan Gu

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental comparisons with other benchmark bandit algorithms on various data sets corroborate our theory." and "Finally, we corroborate the analysis with an empirical evaluation of the algorithm on several benchmarks. Experiments show that Neural TS yields competitive performance, in comparison with state-of-the-art baselines, thus suggesting its practical value in addition to strong theoretical guarantees."
Researcher Affiliation | Collaboration | Weitong Zhang, Department of Computer Science, University of California, Los Angeles, CA 90095, USA (wt.zhang@ucla.edu); Dongruo Zhou, Department of Computer Science, University of California, Los Angeles, CA 90095, USA (drzhou@cs.ucla.edu); Lihong Li, Google Research, USA (lihong@google.com); Quanquan Gu, Department of Computer Science, University of California, Los Angeles, CA 90095, USA (qgu@cs.ucla.edu)
Pseudocode | Yes | Algorithm 1: Neural Thompson Sampling (Neural TS); a hedged sketch of the algorithm loop is given below the table.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology.
Open Datasets | Yes | "This section gives an empirical evaluation of our algorithm in several public benchmark datasets, including adult, covertype, magic telescope, mushroom and shuttle, all from UCI (Dua & Graff, 2017), as well as MNIST (Le Cun et al., 2010)."
Dataset Splits | No | The paper does not explicitly provide training/validation/test splits as percentages, absolute counts, or predefined partitions. It mentions using public datasets and reshuffling the data for repeated runs.
Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers.
Experiment Setup | Yes | "One-hidden-layer neural networks with 100 neurons are used. Note that we do not choose m as suggested by theory, and such a disconnection has its root in the current deep learning theory based on neural tangent kernel, which is not specific to this work. During posterior updating, gradient descent is run for 100 iterations with learning rate 0.001." and "We set the time horizon of our algorithm to 10,000 for all data sets, except for mushroom which contains only 8,124 data." and "For the Neural UCB / Thompson Sampling methods, we use a grid search on λ ∈ {1, 10^-1, 10^-2, 10^-3} and ν ∈ {10^-1, 10^-2, 10^-3, 10^-4, 10^-5}."
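
As a complement to the Pseudocode and Experiment Setup entries above, the following is a minimal PyTorch-style sketch of a Neural TS decision loop under the reported configuration (one hidden layer with 100 neurons, 100 gradient-descent iterations per posterior update, learning rate 0.001). The class and method names (NeuralTS, select_arm, update), the diagonal approximation of the design matrix U, and the exact training objective are illustrative assumptions, not details confirmed by the table above or by the authors' implementation.

```python
import torch
import torch.nn as nn


class NeuralTS:
    """Minimal sketch of a Neural Thompson Sampling loop (names are illustrative).

    Each arm's reward is sampled from a Gaussian whose mean is the network
    prediction f(x; theta) and whose variance is built from the network
    gradient g(x; theta) and a design matrix U.  U is kept diagonal here
    purely for tractability; that approximation is an assumption of this sketch.
    """

    def __init__(self, dim, hidden=100, lam=1.0, nu=0.1, lr=1e-3, steps=100):
        # One-hidden-layer network with 100 neurons, trained for 100 gradient
        # steps with learning rate 0.001, mirroring the reported setup.
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))
        self.m, self.lam, self.nu, self.lr, self.steps = hidden, lam, nu, lr, steps
        self.theta0 = [p.detach().clone() for p in self.net.parameters()]
        n_params = sum(p.numel() for p in self.net.parameters())
        self.U = lam * torch.ones(n_params)      # diagonal surrogate for U = lam * I
        self.contexts, self.rewards = [], []

    def _grad(self, x):
        # Flattened gradient of the scalar prediction w.r.t. all network parameters.
        self.net.zero_grad()
        self.net(x).sum().backward()
        return torch.cat([p.grad.flatten() for p in self.net.parameters()])

    def select_arm(self, arm_contexts):
        # Thompson step: sample one reward per arm from its posterior, take the argmax.
        sampled = []
        for x in arm_contexts:
            g = self._grad(x)
            mean = self.net(x).item()
            var = self.lam * self.nu ** 2 * (g * g / self.U).sum().item() / self.m
            sampled.append(mean + (var ** 0.5) * torch.randn(1).item())
        return max(range(len(arm_contexts)), key=lambda a: sampled[a])

    def update(self, x, reward):
        # Update the (diagonal) design matrix, then retrain on all observed data.
        self.contexts.append(x)
        self.rewards.append(float(reward))
        g = self._grad(x)
        self.U += g * g / self.m
        X, y = torch.stack(self.contexts), torch.tensor(self.rewards).unsqueeze(1)
        opt = torch.optim.SGD(self.net.parameters(), lr=self.lr)
        for _ in range(self.steps):
            opt.zero_grad()
            # Squared loss with l2 regularization toward the initial parameters
            # (a simplification of the NTK-style training objective).
            reg = sum(((p - p0) ** 2).sum() for p, p0 in zip(self.net.parameters(), self.theta0))
            loss = 0.5 * ((self.net(X) - y) ** 2).sum() + 0.5 * self.m * self.lam * reg
            loss.backward()
            opt.step()
```

Following the Experiment Setup row, λ and ν in this sketch would be tuned by grid search over {1, 10^-1, 10^-2, 10^-3} and {10^-1, 10^-2, 10^-3, 10^-4, 10^-5}, respectively, with the bandit loop run for a horizon of 10,000 rounds (fewer for mushroom).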