Fast Conditional Mixing of MCMC Algorithms for Non-log-concave Distributions
Authors: Xiang Cheng, Bohan Wang, Jingzhao Zhang, Yusong Zhu
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct experiments to verify the theoretical results and compare global mixing versus conditional mixing for Gaussian mixture models. We take three Gaussian mixtures: ν1 = 0.9N1(−10, 1) + 0.1N1(10, 1), ν2 = 0.15N1(−5, 1) + 0.15N1(−2.5, 1) + 0.3N1(0, 1) + 0.2N1(2.5, 1) + 0.2N1(5, 1), and ν3 = 0.4N2((−5, −5), I2) + 0.4N2((5, 5), I2) + 0.1N2((−5, 5), I2) + 0.1N2((5, −5), I2) as our target distributions. We use Algorithm 1 as our sampling algorithm, and set step size h = 10⁻². The initial distributions are all uniform in a large enough range. We plot the sampling distribution after T = 500, 5000, 500 rounds respectively in Figures 1a, 1b, and 1c, and plot the conditional and global KL divergence in Figures 1d, 1e, and 1f. |
| Researcher Affiliation | Academia | Xiang Cheng (MIT, x.cheng@berkeley.edu); Bohan Wang (USTC, bhwangfy@gmail.com); Jingzhao Zhang (IIIS, Tsinghua; Shanghai Qizhi Institute, jingzhaoz@mail.tsinghua.edu.cn); Yusong Zhu (Tsinghua University, zhuys19@mails.tsinghua.edu.cn) |
| Pseudocode | Yes | Algorithm 1 Langevin Monte Carlo. Input: initial parameter z, potential function V, step size h, number of iterations T. 1: Initialization z0 ← z; 2: for t = 0, …, T−1 do; 3: generate Gaussian random vector ξt ~ N(0, Id); 4: update z(t+1)h ← zth − h∇V(zth) + √(2h)·ξt; 5: end for. (A runnable sketch of this algorithm appears below the table.) |
| Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology. |
| Open Datasets | No | The paper uses constructed target distributions (mixtures of Gaussians) for its experiments, rather than external publicly available datasets. For instance, in Section 6.1, it states: 'We take three Gaussian mixtures: ν1 = 0.9N1(−10, 1) + 0.1N1(10, 1), ν2 = 0.15N1(−5, 1) + 0.15N1(−2.5, 1) + 0.3N1(0, 1) + 0.2N1(2.5, 1) + 0.2N1(5, 1), and ν3 = 0.4N2((−5, −5), I2) + 0.4N2((5, 5), I2) + 0.1N2((−5, 5), I2) + 0.1N2((5, −5), I2) as our target distributions.' |
| Dataset Splits | No | The paper describes experiments involving sampling from target distributions rather than training models on datasets with explicit train/validation/test splits. Therefore, it does not specify dataset splits for validation. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as CPU or GPU models, or memory specifications. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or programming languages used in the experiments. |
| Experiment Setup | Yes | We use Algorithm 1 as our sampling algorithm, and set step size h = 10⁻². The initial distributions are all uniform in a large enough range. We plot the sampling distribution after T = 500, 5000, 500 rounds respectively in Figures 1a, 1b, and 1c, and plot the conditional and global KL divergence in Figures 1d, 1e, and 1f. (A runnable sketch of this setup follows the table.) |
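
For concreteness, here is a minimal Python sketch of Algorithm 1 applied to the paper's first target ν1 = 0.9N(−10, 1) + 0.1N(10, 1), together with a crude histogram proxy for the conditional versus global KL divergence from the Experiment Setup row. This is not the authors' code (none is released); the chain count, the initialization range [−15, 15], the bin count, and the helper names grad_V, lmc, and kl_on_region are our illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Target from the paper: nu1 = 0.9*N(-10, 1) + 0.1*N(10, 1).
weights = np.array([0.9, 0.1])
means = np.array([-10.0, 10.0])

def grad_V(x):
    """Gradient of the potential V(x) = -log nu1(x). For unit-variance
    components, grad V(x) = sum_i r_i(x) * (x - mu_i), where r_i are the
    posterior responsibilities of the mixture components."""
    logp = np.log(weights) - 0.5 * (x[:, None] - means) ** 2
    logp -= logp.max(axis=1, keepdims=True)  # numerical stability
    r = np.exp(logp)
    r /= r.sum(axis=1, keepdims=True)
    return np.sum(r * (x[:, None] - means), axis=1)

def lmc(z, h, T):
    """Algorithm 1: z <- z - h * grad V(z) + sqrt(2h) * xi, xi ~ N(0, I),
    run for T rounds over a whole population of chains at once."""
    for _ in range(T):
        z = z - h * grad_V(z) + np.sqrt(2.0 * h) * rng.standard_normal(z.shape)
    return z

def kl_on_region(samples, lo, hi, bins=150):
    """Histogram proxy for KL( mu_T restricted to [lo, hi] || nu1 restricted
    to [lo, hi] ), with both measures renormalized to the region. Taking
    [lo, hi] to cover the whole support gives the global KL instead."""
    edges = np.linspace(lo, hi, bins + 1)
    inside = samples[(samples >= lo) & (samples < hi)]
    p = np.histogram(inside, bins=edges)[0] / max(len(inside), 1)
    cdf = sum(w * norm.cdf(edges, loc=m, scale=1.0) for w, m in zip(weights, means))
    q = np.diff(cdf)
    q /= q.sum()  # renormalize the target's mass to the region
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Paper's setting: step size h = 1e-2, uniform initialization over a large
# range, T = 500 rounds; the chain count and the range [-15, 15] are ours.
z0 = rng.uniform(-15.0, 15.0, size=20_000)
samples = lmc(z0, h=1e-2, T=500)

print(f"conditional KL, left mode:  {kl_on_region(samples, -15.0, 0.0):.4f}")
print(f"conditional KL, right mode: {kl_on_region(samples, 0.0, 15.0):.4f}")
print(f"global KL:                  {kl_on_region(samples, -15.0, 15.0):.4f}")
```

Restricting both the empirical and the target measure to a single mode and renormalizing mirrors the paper's notion of conditional mixing: LMC equilibrates quickly within each mode even when transporting the 0.9/0.1 mass between the modes is slow, which is why the conditional KL can fall well before the global KL does.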