Thompson Sampling and Approximate Inference

Authors: My Phan, Yasin Abbasi-Yadkori, Justin Domke

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform the following simulations 1000 times and plot the mean cumulative regret up to time T = 100 in Figure 2b using three different policies
Researcher Affiliation | Collaboration | My Phan, College of Information and Computer Science, University of Massachusetts, Amherst, MA, myphan@cs.umass.edu; Yasin Abbasi-Yadkori, VinAI, Hanoi, Vietnam, yasin.abbasi@gmail.com; Justin Domke, College of Information and Computer Science, University of Massachusetts, Amherst, MA, domke@cs.umass.edu
Pseudocode | No | The paper describes algorithmic steps but does not include structured pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide any concrete access information (e.g., repository link, explicit release statement) for source code related to the described methodology.
Open Datasets | No | The paper describes how data for simulations are generated internally (e.g., 'reward distributions are Norm(0.6, 0.2^2) and Norm(0.5, 0.2^2)', 'the prior is Norm(0, Σ_0)'), rather than referencing a publicly available or open dataset with access information.
Dataset Splits | No | The paper does not provide specific dataset split information (e.g., percentages, sample counts, or detailed splitting methodology for train/validation/test sets) needed to reproduce data partitioning.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions general software packages like Stan, Edward, PyMC, and Infer.NET as examples of implementations, but does not provide specific software dependencies (e.g., library or solver names with version numbers) used for its own experiments.
Experiment Setup | Yes | We set the exploration rate at time t to be 1/t, T = 100 and show the results in Figure 3a and discuss them in Section 6.3.
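
Since no code is released, the following is a minimal Python sketch of the kind of simulation the quoted passages describe: Thompson sampling on a two-armed Gaussian bandit, with forced uniform exploration at rate 1/t, averaged over repeated runs up to T = 100. It is not the authors' implementation and does not reproduce any specific figure. Several things are assumptions made for illustration: the quoted reward variance is read as 0.2^2, each arm gets an independent Norm(0, 1) prior (the `prior_var` parameter is hypothetical), the exact conjugate posterior is used where the paper studies approximate posteriors, and the quoted reward distributions and exploration rate may belong to different experiments in the paper. The function names are also hypothetical.

```python
import random


def run_once(T=100, means=(0.6, 0.5), sigma=0.2, prior_var=1.0,
             explore=True, rng=None):
    """One run of Gaussian Thompson sampling on a K-armed bandit.

    Returns the cumulative pseudo-regret after each of the T rounds.
    Assumes a Norm(0, prior_var) prior per arm and known noise sigma.
    """
    rng = rng or random.Random()
    k = len(means)
    counts = [0] * k      # number of pulls per arm
    sums = [0.0] * k      # sum of observed rewards per arm
    best = max(means)
    regret, total = [], 0.0
    for t in range(1, T + 1):
        if explore and rng.random() < 1.0 / t:
            # Forced uniform exploration with probability 1/t.
            arm = rng.randrange(k)
        else:
            samples = []
            for a in range(k):
                # Exact conjugate Gaussian posterior for the arm mean.
                precision = 1.0 / prior_var + counts[a] / sigma ** 2
                post_mean = (sums[a] / sigma ** 2) / precision
                samples.append(rng.gauss(post_mean, precision ** -0.5))
            # Play the arm whose posterior sample is largest.
            arm = max(range(k), key=samples.__getitem__)
        reward = rng.gauss(means[arm], sigma)
        counts[arm] += 1
        sums[arm] += reward
        total += best - means[arm]
        regret.append(total)
    return regret


def mean_cumulative_regret(n_runs=1000, T=100, seed=0, **kwargs):
    """Average the cumulative regret curve over n_runs independent runs."""
    rng = random.Random(seed)
    acc = [0.0] * T
    for _ in range(n_runs):
        for i, r in enumerate(run_once(T=T, rng=rng, **kwargs)):
            acc[i] += r
    return [x / n_runs for x in acc]
```

Calling `mean_cumulative_regret()` returns a list of length T that could be plotted against the round index; passing `explore=False` gives plain Thompson sampling for comparison.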