Entropy-based Training Methods for Scalable Neural Implicit Samplers

Authors: Weijian Luo, Boya Zhang, Zhihua Zhang

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To demonstrate the effectiveness, efficiency, and scalability of our proposed samplers, we evaluate them on three sampling benchmarks with different scales.
Researcher Affiliation | Academia | Weijian Luo, School of Mathematical Sciences, Peking University (luoweijian@stu.pku.edu.cn); Boya Zhang, Academy for Advanced Interdisciplinary Studies, Peking University (zhangboya@pku.edu.cn); Zhihua Zhang, School of Mathematical Sciences, Peking University (zhzhang@math.pku.edu.cn).
Pseudocode | Yes | We formally give a unified Algorithm for the training of implicit samplers with KL, Fisher, and Combine training in Algorithm 1. (A hedged sketch of such a training loop is given after this table.)
Open Source Code | No | The paper references open-source implementations used for baselines and metrics, such as https://github.com/louissharrock/coin-svgd and https://github.com/jeremiecoullon/SGMCMCJax, but gives no statement or link indicating that the authors' own code is released.
Open Datasets | Yes | The Covertype data set [11] has 54 features and 581,012 observations. It has been widely used as a benchmark for Bayesian inference.
Dataset Splits | Yes | The data set is randomly split into the training set (80%) and the testing set (20%). (See the split sketch after this table.)
Hardware Specification | Yes | The 2D experiment is conducted on an 8-CPU cluster with PyTorch 1.8.1, while the EBM experiment is on 1 Nvidia Titan RTX GPU with PyTorch 1.8.1.
Software Dependencies | Yes | The 2D experiment is conducted on an 8-CPU cluster with PyTorch 1.8.1, while the EBM experiment is on 1 Nvidia Titan RTX GPU with PyTorch 1.8.1.
Experiment Setup | Yes | For all MCMC samplers, we set the number of iterations to 500... For SVGD and LD, we set the sampling step size to 0.01. For the HMC sampler, we optimize and find the step size to be 0.1, and Leap Frog updates to 10 work the best... For all targets, we train each neural sampler with the Adam optimizer with the same learning rate of 2e-5 and default betas. We use the same batch size of 5000 for 10k iterations when training all neural samplers. (See the configuration sketch after this table.)
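
The Pseudocode row points to the paper's unified Algorithm 1 for KL, Fisher, and Combine training. The snippet below is a minimal sketch of what score-based KL training of an implicit sampler can look like, not a reproduction of Algorithm 1: the placeholder target `log_p`, the network sizes, the noise scale `sigma`, and the use of a denoising-score-matching auxiliary network for the sampler's entropy/score term are all assumptions made for runnability; only the learning rate (2e-5), batch size (5000), and iteration count (10k) come from the quoted setup. The Fisher and Combine variants are omitted.

```python
import torch
import torch.nn as nn

dim = 2  # low-dimensional toy target, in the spirit of the paper's 2D benchmark


def log_p(x):
    # Unnormalized log-density of a placeholder target (standard Gaussian here).
    return -0.5 * (x ** 2).sum(dim=-1)


# Implicit sampler g_theta: noise z -> sample x.  Architecture is assumed.
sampler = nn.Sequential(nn.Linear(dim, 128), nn.SiLU(), nn.Linear(128, dim))
# Auxiliary network estimating grad_x log q_theta(x); the paper uses its own
# entropy/score estimator, so this is a stand-in.
score_net = nn.Sequential(nn.Linear(dim, 128), nn.SiLU(), nn.Linear(128, dim))

opt_g = torch.optim.Adam(sampler.parameters(), lr=2e-5)    # lr quoted in the table
opt_s = torch.optim.Adam(score_net.parameters(), lr=1e-4)  # assumed value
batch, iters, sigma = 5000, 10_000, 0.1                    # batch/iters quoted; sigma assumed

for it in range(iters):
    # 1) Draw a batch from the implicit sampler.
    z = torch.randn(batch, dim)
    x = sampler(z)

    # 2) Fit the auxiliary score network with denoising score matching on
    #    detached samples, so it tracks grad_x log q_theta.
    x_d = x.detach()
    noise = torch.randn_like(x_d)
    dsm_loss = ((score_net(x_d + sigma * noise) + noise / sigma) ** 2).sum(-1).mean()
    opt_s.zero_grad()
    dsm_loss.backward()
    opt_s.step()

    # 3) KL update: grad_theta KL(q_theta || p) = E[(grad log q - grad log p)^T dx/dtheta].
    #    Both score terms are detached; gradients flow only through x = g_theta(z).
    x_leaf = x.detach().requires_grad_(True)
    grad_log_p = torch.autograd.grad(log_p(x_leaf).sum(), x_leaf)[0]
    direction = (score_net(x_d) - grad_log_p).detach()
    surrogate = (direction * x).sum(-1).mean()
    opt_g.zero_grad()
    surrogate.backward()
    opt_g.step()
```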
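
The Covertype split quoted in the Dataset Splits row can be illustrated with scikit-learn's bundled copy of the same dataset. The random seed below is an assumption; the paper does not report one, nor whether the random split is stratified.

```python
# Minimal sketch of the quoted 80%/20% random split of Covertype.  scikit-learn's
# fetch_covtype ships the same 581,012 x 54 dataset; random_state is an assumed
# value, since the paper does not report its split seed.
from sklearn.datasets import fetch_covtype
from sklearn.model_selection import train_test_split

X, y = fetch_covtype(return_X_y=True)     # X: (581012, 54), y: cover-type labels
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)  # roughly 464.8k training rows, 116.2k testing rows
```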
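
The Experiment Setup row reports the baseline and training hyper-parameters; the dictionary below collects them in one place, and the function sketches an unadjusted Langevin dynamics (LD) baseline with the quoted step size. The LD update uses the standard x <- x + (step/2) * grad log p(x) + sqrt(step) * noise discretization; the paper does not print its exact implementation or noise-scale convention, so treat this as illustrative rather than the authors' code.

```python
import torch

# Hyper-parameters quoted in the Experiment Setup row, gathered in one place.
config = {
    "mcmc_iterations": 500,      # all MCMC baselines
    "svgd_ld_step_size": 0.01,   # SVGD and LD
    "hmc_step_size": 0.1,        # HMC
    "hmc_leapfrog_steps": 10,    # HMC leap-frog updates
    "adam_lr": 2e-5,             # neural samplers, default Adam betas
    "batch_size": 5000,
    "train_iterations": 10_000,
}


def langevin_sample(log_p, x0, step=0.01, n_iters=500):
    """Unadjusted Langevin dynamics: x <- x + (step/2) * grad log p(x) + sqrt(step) * noise."""
    x = x0.detach().clone()
    for _ in range(n_iters):
        x.requires_grad_(True)
        grad = torch.autograd.grad(log_p(x).sum(), x)[0]
        x = (x + 0.5 * step * grad + step ** 0.5 * torch.randn_like(x)).detach()
    return x


# Example: 1,000 chains targeting a standard Gaussian with the quoted settings.
samples = langevin_sample(lambda x: -0.5 * (x ** 2).sum(-1),
                          torch.randn(1000, 2),
                          step=config["svgd_ld_step_size"],
                          n_iters=config["mcmc_iterations"])
```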