Ensemble Sampling
Authors: Xiuyuan Lu, Benjamin Van Roy
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We establish a theoretical basis that supports the approach and present computational results that offer further insight. In this section, we present computational results that demonstrate viability of ensemble sampling. |
| Researcher Affiliation | Academia | Xiuyuan Lu Stanford University lxy@stanford.edu Benjamin Van Roy Stanford University bvr@stanford.edu |
| Pseudocode | Yes | Algorithm 1 (Ensemble Sampling). 1: Sample: θ_{0,1}, …, θ_{0,M} ∼ p_0; 2: for t = 0, …, T−1 do; 3: Sample: m ∼ unif({1, …, M}); 4: Act: A_t = argmax_{a∈A} R̂_{θ_{t,m}}(a); 5: Observe: Y_{t+1}; 6: Update: θ_{t+1,1}, …, θ_{t+1,M}; 7: end for. (See the Python sketch after this table.) |
| Open Source Code | No | The paper does not provide an explicit statement or link for the availability of its source code. |
| Open Datasets | No | The paper uses synthetic data generated according to specified distributions and parameters (e.g., Gaussian bandits and neural networks with specific noise and prior distributions) rather than relying on named, publicly available datasets with explicit access information. |
| Dataset Splits | No | The paper does not explicitly provide specific details about train/validation/test dataset splits (e.g., percentages, sample counts, or citations to predefined splits) needed for reproducibility. |
| Hardware Specification | No | The paper does not specify any particular hardware components (e.g., CPU, GPU models, or memory) used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., 'Python 3.8, PyTorch 1.9'). |
| Experiment Setup | Yes | For the Gaussian bandit: We set the input dimension N = 100, number of actions K = 100, prior variance λ = 10, and noise variance σ_z² = 100. For the neural network bandit: We used N = 100 for the input dimension, D = 50 for the dimension of the hidden layer, number of actions K = 100, prior variance λ = 1, and noise variance σ_z² = 100. We took 3 stochastic gradient steps with a minibatch size of 64 for each model update. We used a learning rate of 1e-1 for ϵ-greedy and ensemble sampling, and learning rates of 1e-2, 1e-2, 2e-2, and 5e-2 for dropout with dropping probabilities 0.25, 0.5, 0.75, and 0.9, respectively. (These settings are collected in the configuration sketch after this table.) |
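
The pseudocode quoted above translates directly into a short simulation loop. Below is a minimal Python sketch of ensemble sampling for a linear Gaussian bandit, assuming a Gaussian prior p_0 = N(0, λI) and Gaussian reward noise as in the paper's experiments; the incremental SGD update on randomly perturbed rewards is one way to realize the model update in line 6, and all function and variable names here are our own, not the authors' code.

```python
import numpy as np

def ensemble_sampling(actions, true_theta, M=10, T=1000,
                      prior_var=10.0, noise_var=100.0,
                      lr=0.1, seed=0):
    """Minimal sketch of Algorithm 1 for a linear Gaussian bandit.

    actions: (K, N) array of feature vectors, one row per action.
    true_theta: (N,) environment parameter that generates rewards.
    """
    rng = np.random.default_rng(seed)
    K, N = actions.shape
    # 1: Sample M models from the prior p_0 = N(0, prior_var * I).
    models = rng.normal(0.0, np.sqrt(prior_var), size=(M, N))
    for t in range(T):
        # 3: Sample a model index uniformly from {1, ..., M}.
        m = rng.integers(M)
        # 4: Act greedily with respect to the sampled model.
        a = int(np.argmax(actions @ models[m]))
        # 5: Observe a noisy reward from the environment.
        y = actions[a] @ true_theta + rng.normal(0.0, np.sqrt(noise_var))
        # 6: Update all M models; each model fits its own randomly
        # perturbed copy of the reward, which keeps the ensemble diverse.
        for i in range(M):
            z = rng.normal(0.0, np.sqrt(noise_var))
            err = actions[a] @ models[i] - (y + z)
            models[i] -= lr * err * actions[a]  # SGD step on squared error
    return models
```

The per-model reward perturbation is the key design choice: without it, all M models would fit the same data and collapse together, and the uniform sampling in line 3 would no longer approximate posterior sampling.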
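
For reference, the hyperparameters quoted in the Experiment Setup row can be collected into a single configuration block. The dictionary layout and key names below are our own framing of the paper's reported settings, not the authors' code.

```python
# Settings quoted from the paper; the structure and names are illustrative.
gaussian_bandit_setup = {
    "input_dim": 100,    # N
    "num_actions": 100,  # K
    "prior_var": 10.0,   # λ
    "noise_var": 100.0,  # σ_z²
}

neural_net_setup = {
    "input_dim": 100,           # N
    "hidden_dim": 50,           # D
    "num_actions": 100,         # K
    "prior_var": 1.0,           # λ
    "noise_var": 100.0,         # σ_z²
    "sgd_steps_per_update": 3,
    "minibatch_size": 64,
    "learning_rates": {
        "epsilon_greedy": 1e-1,
        "ensemble_sampling": 1e-1,
        # Dropout learning rate keyed by dropping probability.
        "dropout": {0.25: 1e-2, 0.5: 1e-2, 0.75: 2e-2, 0.9: 5e-2},
    },
}
```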