Function Space Particle Optimization for Bayesian Neural Networks
Authors: Ziyu Wang, Tongzheng Ren, Jun Zhu, Bo Zhang
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate through extensive experiments that our method successfully overcomes this issue, and outperforms strong baselines in a variety of tasks including prediction, defense against adversarial examples, and reinforcement learning. (Abstract) and In this section, we evaluate our method on a variety of tasks. (Section 5, Evaluation) |
| Researcher Affiliation | Academia | Ziyu Wang, Tongzheng Ren, Jun Zhu, Bo Zhang Department of Computer Science & Technology, Institute for Artificial Intelligence, State Key Lab for Intell. Tech. & Sys., BNRist Center, THBI Lab, Tsinghua University {wzy196,rtz19970824}@gmail.com, {dcszj,dcszb}@tsinghua.edu.cn |
| Pseudocode | Yes | Algorithm 1 Function Space POVI for Bayesian Neural Network |
| Open Source Code | Yes | Code for the experiments will be available at https://github.com/thu-ml/fpovi. |
| Open Datasets | Yes | We evaluate the predictive performance on several standard regression and classification datasets: a number of UCI datasets for real-valued regression, and the MNIST dataset for classification. (Section 5.2) and We use the mushroom and wheel bandits from Riquelme et al. (2018). (Section 5.4) |
| Dataset Splits | Yes | We use a 90-10 random train-test split, repeated for 20 times, except for Protein in which we use 5 replicas. (Appendix A.2.1) and We hold out the last 10,000 examples in training set for model selection. (Section 5.2.2, MNIST) (see the split sketch below the table) |
| Hardware Specification | No | No specific hardware details (like CPU/GPU models, memory) are provided. |
| Software Dependencies | No | The implementation is based on ZhuSuan (Shi et al., 2017). (Section 5). While ZhuSuan is named, no version number is given for it, nor for any other dependencies such as Python or the underlying framework (PyTorch/TensorFlow). |
| Experiment Setup | Yes | For our method in all datasets, we use the Adam optimizer with a learning rate of 0.004. For datasets with fewer than 1000 samples, we use a batch size of 100 and train for 500 epochs. For the larger datasets, we set the batch size to 1000, and train for 1000 epochs. (Appendix A.2.1) (a sketch of this configuration follows the table) |
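The split protocol quoted in the "Dataset Splits" row is concrete enough to restate in code. The sketch below is an illustration of that protocol only, not the authors' released implementation: `random_splits` is a hypothetical helper, and dataset loading is omitted.

```python
import numpy as np

def random_splits(X, y, n_replicas=20, test_frac=0.1, seed=0):
    """Yield random 90-10 train-test splits, repeated n_replicas times
    (20 in the paper; 5 replicas for the Protein dataset)."""
    rng = np.random.RandomState(seed)
    n_test = int(len(X) * test_frac)
    for _ in range(n_replicas):
        perm = rng.permutation(len(X))
        test_idx, train_idx = perm[:n_test], perm[n_test:]
        yield (X[train_idx], y[train_idx]), (X[test_idx], y[test_idx])
```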
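Likewise, the "Experiment Setup" row pins down the optimizer and training schedule. A minimal sketch follows, assuming a TensorFlow backend (the implementation is stated to be based on ZhuSuan, which runs on TensorFlow); the `training_config` helper is an assumption for illustration, not the authors' API.

```python
import tensorflow as tf

def training_config(n_samples):
    # Quoted setup: datasets with fewer than 1000 samples use a batch
    # size of 100 and 500 epochs; larger datasets use a batch size of
    # 1000 and 1000 epochs.
    if n_samples < 1000:
        return {"batch_size": 100, "epochs": 500}
    return {"batch_size": 1000, "epochs": 1000}

# Adam with the quoted learning rate of 0.004, used for all datasets.
optimizer = tf.keras.optimizers.Adam(learning_rate=0.004)
```

Combined with the split generator above, each of the 20 replicas would retrain the model from scratch under this configuration and evaluate on its held-out 10%.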