Lower Bounds for the Gibbs Sampler over Mixtures of Gaussians

Authors: Christopher Tosh, Sanjoy Dasgupta

ICML 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In Section 7 we experimentally verify that this is the case by showing that most runs of the Gibbs sampler spend the majority of their time in local optima with relatively low probability mass.
Researcher Affiliation | Academia | Christopher Tosh (CTOSH@CS.UCSD.EDU) and Sanjoy Dasgupta (DASGUPTA@CS.UCSD.EDU), Department of Computer Science and Engineering, UCSD, 9500 Gilman Drive, La Jolla, CA 92093-0404
Pseudocode | Yes | Algorithm 1: The collapsed Gibbs sampler, P. Algorithm 2: The projected Gibbs sampler, P̄. (A generic sketch of one collapsed-Gibbs sweep appears after the table.)
Open Source Code | No | The paper does not provide any statement about making its source code publicly available or provide a link to a code repository.
Open Datasets | No | For each experiment, we generated the point sequence by taking k = 10 draws from a d-dimensional spherical Gaussian N(0, σ₀²I_d) to get means µ_1, …, µ_10. For each mean µ_i, we took n = 50 draws from N(µ_i, σ²I_d) with σ = 0.5.
Dataset Splits | No | The paper does not describe dataset splits for training, validation, or testing in the traditional machine learning sense. The experiments involve simulating a Gibbs sampler on generated data points, not splitting a dataset for model training.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as CPU/GPU models or memory.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers needed to replicate the experiments.
Experiment Setup | Yes | For each experiment, we generated the point sequence by taking k = 10 draws from a d-dimensional spherical Gaussian N(0, σ₀²I_d) to get means µ_1, …, µ_10. For each mean µ_i, we took n = 50 draws from N(µ_i, σ²I_d) with σ = 0.5. Each run of the Gibbs sampler was done for 1,000,000 steps, and we plotted at each step the log of the relative probability of the current state C. To generate our initial configurations, we randomly chose k centers and clustered the points together that were closest to a particular center. (A sketch of this data generation and initialization appears after the table.)
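
The Experiment Setup row pins down k, n, and σ but not d or σ₀, so the following is a minimal NumPy sketch of the described data generation and nearest-center initialization. The values of d and σ₀ are placeholder assumptions, and this is an illustration of the described setup, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

# From the quoted setup: k = 10 means, n = 50 points per mean, sigma = 0.5.
# d and sigma_0 are NOT given in the quoted text; these are placeholders.
k, n, d = 10, 50, 2
sigma, sigma_0 = 0.5, 3.0

# Draw k means from a spherical Gaussian N(0, sigma_0^2 I_d).
means = rng.normal(scale=sigma_0, size=(k, d))

# For each mean mu_i, draw n points from N(mu_i, sigma^2 I_d).
points = np.vstack([rng.normal(loc=mu, scale=sigma, size=(n, d)) for mu in means])

# Initial configuration: pick k random data points as "centers" and
# assign every point to its nearest center.
centers = points[rng.choice(len(points), size=k, replace=False)]
dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
init_labels = dists.argmin(axis=1)
```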
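For the Pseudocode row: the paper's Algorithm 1 is the collapsed Gibbs sampler, whose state is a clustering of the points with the component parameters integrated out. Below is a generic sketch of one collapsed-Gibbs sweep for a fixed-k mixture of spherical Gaussians with a conjugate N(0, σ₀²I) prior on each mean and a symmetric Dirichlet(α/k) prior on the mixing weights. The specific priors and conditionals here are standard-textbook assumptions, not a transcription of the paper's algorithm.

```python
import numpy as np

def collapsed_gibbs_sweep(points, labels, k, sigma, sigma_0, alpha=1.0, rng=None):
    """One sweep of a collapsed Gibbs sampler for a fixed-k mixture of
    spherical Gaussians N(mu_c, sigma^2 I), with conjugate N(0, sigma_0^2 I)
    priors on the means and a symmetric Dirichlet(alpha/k) prior on the
    mixing weights. Means and weights are integrated out, so the state is
    just the label vector."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = points.shape
    for i in range(n):
        log_p = np.empty(k)
        for c in range(k):
            # Sufficient statistics of cluster c with point i held out.
            mask = labels == c
            mask[i] = False
            m = mask.sum()
            s = points[mask].sum(axis=0)
            # Posterior over mu_c given the other points assigned to c.
            post_var = 1.0 / (1.0 / sigma_0**2 + m / sigma**2)
            post_mean = post_var * s / sigma**2
            # Posterior predictive of x_i under cluster c (spherical Gaussian).
            pred_var = sigma**2 + post_var
            resid = points[i] - post_mean
            log_p[c] = (np.log(m + alpha / k)
                        - 0.5 * d * np.log(2.0 * np.pi * pred_var)
                        - 0.5 * resid @ resid / pred_var)
        # Resample z_i from its collapsed conditional.
        probs = np.exp(log_p - log_p.max())
        labels[i] = rng.choice(k, p=probs / probs.sum())
    return labels
```

With the points and init_labels from the previous sketch, a run as described in the table would repeat calls like collapsed_gibbs_sweep(points, init_labels, k=10, sigma=0.5, sigma_0=3.0) for 1,000,000 steps, logging the relative probability of the current clustering after each one.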