Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Ensemble Sampling

Authors: Xiuyuan Lu, Benjamin Van Roy

NeurIPS 2017 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We establish a theoretical basis that supports the approach and present computational results that offer further insight." "In this section, we present computational results that demonstrate viability of ensemble sampling."
Researcher Affiliation | Academia | Xiuyuan Lu, Stanford University (EMAIL); Benjamin Van Roy, Stanford University (EMAIL)
Pseudocode | Yes |
  Algorithm 1 Ensemble Sampling
  1: Sample: θ_{0,1}, ..., θ_{0,M} ~ p_0
  2: for t = 0, ..., T − 1 do
  3:     Sample: m ~ unif({1, ..., M})
  4:     Act: A_t = argmax_{a ∈ A} R̂_{θ_{t,m}}(a)
  5:     Observe: Y_{t+1}
  6:     Update: θ_{t+1,1}, ..., θ_{t+1,M}
  7: end for
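The algorithm above can be made concrete with a short sketch. This is a minimal illustrative implementation for an independent Gaussian K-armed bandit, not the paper's exact update rule: the function name, the shared pull counts, and the perturbed-running-average update (which assumes a unit-variance prior) are assumptions made for this example.

```python
import numpy as np

def ensemble_sampling(true_means, M=10, T=500, sigma=1.0, seed=0):
    """Illustrative ensemble sampling for a K-armed Gaussian bandit."""
    rng = np.random.default_rng(seed)
    K = len(true_means)
    theta0 = rng.normal(0.0, 1.0, size=(M, K))  # one prior draw per model
    perturbed_sum = np.zeros((M, K))            # per-model sums of noise-perturbed rewards
    n = np.zeros(K)                             # pull counts, shared across models
    total_reward = 0.0
    for t in range(T):
        m = rng.integers(M)                          # sample a model uniformly
        theta = (theta0 + perturbed_sum) / (1.0 + n) # each model's per-arm mean estimate
        a = int(np.argmax(theta[m]))                 # act greedily w.r.t. model m
        r = true_means[a] + sigma * rng.normal()     # observe a noisy reward
        total_reward += r
        # update every model with its own independently perturbed observation,
        # which keeps the ensemble spread out and approximates posterior sampling
        perturbed_sum[:, a] += r + sigma * rng.normal(size=M)
        n[a] += 1
    return total_reward, n
```

On an easy two-armed instance, the ensemble should concentrate its pulls on the better arm as the perturbed estimates converge.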
Open Source Code | No | The paper does not provide an explicit statement or link for the availability of its source code.
Open Datasets | No | The paper uses synthetic data generated according to specified distributions and parameters (e.g., Gaussian bandits and neural networks with specific noise and prior distributions) rather than named, publicly available datasets with explicit access information.
Dataset Splits | No | The paper does not explicitly provide train/validation/test split details (e.g., percentages, sample counts, or citations to predefined splits) needed for reproducibility.
Hardware Specification | No | The paper does not specify the hardware used for the experiments (e.g., CPU or GPU models, or memory).
Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., "Python 3.8, PyTorch 1.9").
Experiment Setup | Yes | For the Gaussian linear bandit: "We set the input dimension N = 100, number of actions K = 100, prior variance λ = 10, and noise variance σ_z² = 100." For the neural network bandit: "We used N = 100 for the input dimension, D = 50 for the dimension of the hidden layer, number of actions K = 100, prior variance λ = 1, and noise variance σ_z² = 100. We took 3 stochastic gradient steps with a minibatch size of 64 for each model update. We used a learning rate of 1e-1 for ε-greedy and ensemble sampling, and a learning rate of 1e-2, 1e-2, 2e-2, and 5e-2 for dropout with dropping probabilities 0.25, 0.5, 0.75, and 0.9, respectively."
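The quoted neural-network experiment settings can be collected into a single configuration for reference. The values are taken from the excerpt above; the dictionary layout and key names are my own choice, not the paper's.

```python
# Hyperparameters quoted from the paper's neural-network bandit experiment.
# Key names are illustrative; values come from the experiment-setup excerpt.
NN_EXPERIMENT = {
    "input_dim_N": 100,
    "hidden_dim_D": 50,
    "num_actions_K": 100,
    "prior_variance_lambda": 1.0,
    "noise_variance_sigma_z_sq": 100.0,
    "sgd_steps_per_update": 3,
    "minibatch_size": 64,
    "learning_rate": {
        "epsilon_greedy": 1e-1,
        "ensemble_sampling": 1e-1,
        # dropout probability -> learning rate used with that probability
        "dropout": {0.25: 1e-2, 0.5: 1e-2, 0.75: 2e-2, 0.9: 5e-2},
    },
}
```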