Centroid Approximation for Bootstrap: Improving Particle Quality at Inference
Authors: Mao Ye, Qiang Liu
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we show that our method can boost the performance of bootstrap in a variety of applications. ... Empirically, we apply the centroid approximation method to various applications, including confidence interval estimation (Di Ciccio et al., 1996), bootstrap method for contextual bandit (Riquelme et al., 2018), bootstrap deep Q-network (Osband et al., 2016) and bagging method (Breiman, 1996) for neural networks. We find that our method consistently improves over the standard bootstrap. |
| Researcher Affiliation | Academia | Mao Ye¹, Qiang Liu¹. ¹Department of Computer Science, University of Texas at Austin. |
| Pseudocode | Yes | Algorithm 1: Ideal algorithm for centroid approximation with full-batch gradient and w_h updated every iteration. ... Algorithm 2: Practical implementation of centroid approximation with less frequent updating of w_h and stochastic gradient enabled. |
| Open Source Code | Yes | Code is available at https://github.com/lushleaf/centroid_approximation. |
| Open Datasets | Yes | We consider three datasets: Mushroom, Statlog and Financial. ... We consider two benchmark environments: Lunar Lander-v2 and Catcher-v0 from Gym (Brockman et al., 2016) and the PyGame Learning Environment (Tasfi, 2016). ... We consider the image classification task on CIFAR-100 and use standard VGG-16 (Simonyan & Zisserman, 2014) with batch normalization. |
| Dataset Splits | Yes | For Lunar Lander-v2, we train the model for 450 episodes with the first 50 episodes used to initialize the common memory buffer. ... For Catcher-v0, we train the model for 100 episodes with the first 10 episodes used to initialize the common memory buffer. ... Measuring the quality of confidence interval: With a large number N of independently generated training data (we use N = 1000), we are able to obtain the corresponding confidence intervals {CI(α)_s}_{s=1}^N and thus obtain the probability that the true parameter falls into the confidence intervals, which is the estimated coverage probability (1/N) Σ_{s=1}^N I{θ0 ∈ CI(α)_s}. (See the coverage-probability sketch below the table.) |
| Hardware Specification | No | The paper mentions 'wall clock time' for training comparisons but does not specify any hardware details like GPU/CPU models, memory, or cloud instances used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'RMSprop optimizer' and 'Adam optimizer' and refers to models like 'VGG-16' and 'DQN', implying the use of machine learning frameworks (e.g., PyTorch, TensorFlow). However, it does not provide specific version numbers for any software libraries or dependencies. |
| Experiment Setup | Yes | For Lunar Lander-v2, we train the model for 450 episodes... The maximum number of steps within each episode is set to 1000... For Catcher-v0, we train the model for 100 episodes... The maximum number of steps within each episode is set to 2000... At each step, the policy networks of all particles are updated using one step of gradient descent with the Adam optimizer (β = (0.9, 0.999) and learning rate 0.001) and mini-batch data (size 64) sampled from its replay buffer. ... We train the bootstrap model for 160 epochs using the SGD optimizer with 0.9 momentum and batch size 128. The learning rate is initialized to 0.1 and is decayed by a factor of 10 at epochs 80 and 120. (See the training-schedule sketch below the table.) |
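
The coverage-probability formula quoted in the Dataset Splits row reduces to a Monte Carlo count over the N independently generated datasets: the fraction of the N confidence intervals that contain the true parameter θ0. Below is a minimal sketch, assuming the intervals are already available as (lower, upper) pairs; the function name `estimated_coverage` and the synthetic example values are illustrative placeholders, not taken from the paper's code.

```python
import numpy as np

def estimated_coverage(intervals, theta_0):
    """Estimate coverage: (1/N) * sum_s I{theta_0 in CI(alpha)_s}.

    intervals : array-like of shape (N, 2), rows are the (lower, upper) bounds
                of the confidence interval computed on the s-th dataset.
    theta_0   : scalar true parameter value.
    """
    intervals = np.asarray(intervals, dtype=float)
    lower, upper = intervals[:, 0], intervals[:, 1]
    return float(((lower <= theta_0) & (theta_0 <= upper)).mean())

# Illustrative check with N = 1000 synthetic intervals (not the paper's data):
rng = np.random.default_rng(0)
centers = rng.normal(loc=0.0, scale=0.1, size=1000)      # estimator draws around theta_0 = 0
cis = np.stack([centers - 0.2, centers + 0.2], axis=1)   # half-width 0.2 on each side
print(estimated_coverage(cis, theta_0=0.0))               # ~0.95 for these synthetic values
```

The estimated coverage is then compared against the nominal confidence level to judge how well calibrated the bootstrap confidence intervals are.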
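
The Experiment Setup row fully specifies the optimization schedule for the CIFAR-100 bagging experiment: 160 epochs of SGD with momentum 0.9 and batch size 128, with the learning rate initialized to 0.1 and decayed by a factor of 10 at epochs 80 and 120. A minimal sketch of that schedule in PyTorch, assuming torchvision's `vgg16_bn` and CIFAR-100 loader stand in for the authors' implementation (the paper does not name its framework, data augmentation, or weight decay, so those choices here are assumptions):

```python
import torch
import torchvision
from torch import nn, optim
from torchvision import transforms

# CIFAR-100 training data; the augmentation pipeline is an assumption, not quoted from the paper.
transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])
train_set = torchvision.datasets.CIFAR100(root="./data", train=True, download=True,
                                          transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

# VGG-16 with batch normalization, output layer sized for 100 classes.
model = torchvision.models.vgg16_bn(num_classes=100)
criterion = nn.CrossEntropyLoss()

# Quoted schedule: SGD, momentum 0.9, lr 0.1, decayed by 10x at epochs 80 and 120, 160 epochs total.
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[80, 120], gamma=0.1)

for epoch in range(160):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```

For the reinforcement-learning experiments the same pattern would apply with the quoted Adam settings, e.g. `optim.Adam(policy.parameters(), lr=0.001, betas=(0.9, 0.999))` updated once per environment step on mini-batches of 64 transitions, where `policy` is a placeholder for a particle's policy network.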