The Infinite Mixture of Infinite Gaussian Mixtures
Authors: Halid Z Yerebakan, Bartek Rajwa, Murat Dundar
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on several artificial and real-world data sets suggest the proposed I2GMM model can predict clusters more accurately than existing variational Bayes and Gibbs sampler versions of DPMG. |
| Researcher Affiliation | Academia | Halid Z. Yerebakan, Department of Computer and Information Science, IUPUI, Indianapolis, IN 46202, hzyereba@cs.iupui.edu; Bartek Rajwa, Bindley Bioscience Center, Purdue University, W. Lafayette, IN 47907, rajwa@cyto.purdue.edu; Murat Dundar, Department of Computer and Information Science, IUPUI, Indianapolis, IN 46202, dundar@cs.iupui.edu |
| Pseudocode | No | The paper describes the generative model and inference steps using mathematical equations, but does not provide structured pseudocode or an algorithm block. |
| Open Source Code | Yes | I2GMM is implemented in C++. The source files and executables are available on the web. https://github.com/halidziya/I2GMM |
| Open Datasets | Yes | Lymphoma: Lymphoma data set is one of the data sets used in the FlowCAP (Flow Cytometry Critical Assessment of Population Identification Methods) 2010 competition [1]. |
| Dataset Splits | No | The paper refers to training for MCMC sampling (“run for 1500 sweeps”, “1000 samples are ignored as burn-in”) but does not provide specific train/validation/test dataset splits for reproducibility. |
| Hardware Specification | Yes | The largest gain by parallelization is obtained on the rare classes data set which offered almost two-fold increase by parallelization on an eight-core workstation. |
| Software Dependencies | No | The paper mentions “C++” and “MATLAB” for implementations but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | We use vague priors with α and γ by fixing their value to one. We set m to the minimum feasible value, which is d+2... We use s = 150/(d(log d)), κ0 = 0.05, and κ1 = 0.5 in experiments with all five data sets described above. |
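
The settings quoted in the Experiment Setup row depend only on the data dimensionality d, so they can be collected in a few lines. The following is a minimal Python sketch, not the authors' C++ implementation: the function name `i2gmm_hyperparameters` and the dictionary keys are hypothetical, and the natural logarithm is assumed for "log d" because the quoted text does not state the base.

```python
import math


def i2gmm_hyperparameters(d: int) -> dict:
    """Collect the hyperparameter values quoted in the Experiment Setup row.

    Hypothetical helper for illustration only. Assumes "log" in
    s = 150/(d log d) is the natural logarithm; the quoted text
    does not specify the base.
    """
    if d < 2:
        # log(1) = 0 would make s undefined, so require d >= 2.
        raise ValueError("d must be at least 2 so that log(d) is positive")
    return {
        "alpha": 1.0,                      # fixed to one (vague prior)
        "gamma": 1.0,                      # fixed to one (vague prior)
        "m": d + 2,                        # minimum feasible value, d + 2
        "s": 150.0 / (d * math.log(d)),    # s = 150 / (d log d)
        "kappa0": 0.05,
        "kappa1": 0.5,
    }


if __name__ == "__main__":
    # Print the settings for a few example dimensionalities.
    for d in (2, 5, 10):
        print(d, i2gmm_hyperparameters(d))
```

Under these assumptions, only m and s change with d; the remaining values are constants shared across all data sets.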