Scalable Estimation of Dirichlet Process Mixture Models on Distributed Data
Authors: Ruohui Wang, Dahua Lin
IJCAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on large real-world data sets show that the proposed method can achieve high scalability in distributed and asynchronous environments without compromising the mixing performance. |
| Researcher Affiliation | Academia | Ruohui Wang, Department of Information Engineering, The Chinese University of Hong Kong (wr013@ie.cuhk.edu.hk); Dahua Lin, Department of Information Engineering, The Chinese University of Hong Kong (dhlin@ie.cuhk.edu.hk) |
| Pseudocode | Yes | Algorithm 1 (Progressive Consolidation) and Algorithm 2 (Restricted Consolidation) |
| Open Source Code | No | The paper does not provide any statement about releasing source code, nor does it include links to a code repository. |
| Open Datasets | Yes | The ImageNet dataset is constructed from the training set of ILSVRC [Russakovsky et al., 2015]... For the New York Times (NYT) Corpus [Sandhaus, 2008]... |
| Dataset Splits | No | The paper does not provide specific details on training, validation, and test splits (e.g., percentages or counts). It mentions using the 'training set' for ImageNet and 'provided groundtruths' for evaluation but lacks explicit split information. |
| Hardware Specification | No | We conducted the experiments using up to 30 workers on multiple physical servers. They can communicate with each other via Gigabit Ethernet or TCP loop-back interfaces. |
| Software Dependencies | No | The paper describes various algorithms and methods but does not provide specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, scikit-learn X.Y). |
| Experiment Setup | Yes | We formulate a Gaussian mixture to describe the feature samples, where the covariance of each Gaussian component is fixed to σ²I with σ = 8. We use N(0, σ₀²I) as the prior distribution over the mean parameters of these components, where σ₀ = 8. For the New York Times (NYT) Corpus [Sandhaus, 2008], we construct a vocabulary with 9866 distinct words and derive a bag-of-words representation for each article. Removing those with fewer than 20 words, we obtain a data set with about 1.7M articles. We use a mixture of multinomial distributions to describe the NYT corpus. The prior here is a symmetric Dirichlet distribution with hyperparameter γ = 1. |
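The experiment setup quoted above fixes conjugate priors for both data sets: a spherical Gaussian likelihood with a zero-mean Gaussian prior on the component means for the ImageNet features, and a multinomial likelihood with a symmetric Dirichlet prior for the NYT bag-of-words counts. Below is a minimal Python sketch of the two conjugate posterior updates under the stated hyperparameters (σ = σ₀ = 8, γ = 1, vocabulary size 9866); the function names, the 128-dimensional toy features, and the random data are illustrative assumptions, not the authors' code.

```python
import numpy as np

# Hedged sketch of the two mixture setups described in the paper's experiment section.
# Hyperparameters (sigma, sigma0, gamma, vocab_size) come from the paper; everything
# else (function names, feature dimension, toy data) is an assumption for illustration.

def gaussian_mean_posterior(x_sum, n, sigma=8.0, sigma0=8.0):
    """Posterior N(mu_post, tau2_post * I) over a component mean, given n assigned
    observations with coordinate-wise sum x_sum, likelihood N(mu, sigma^2 I), and
    prior N(0, sigma0^2 I)."""
    prec = 1.0 / sigma0**2 + n / sigma**2      # per-dimension posterior precision
    tau2_post = 1.0 / prec
    mu_post = tau2_post * (x_sum / sigma**2)   # prior mean is 0, so only the data term remains
    return mu_post, tau2_post

def multinomial_posterior(word_counts, gamma=1.0):
    """Dirichlet posterior parameters for one component's word distribution, given the
    summed bag-of-words counts of the articles assigned to it and a symmetric
    Dirichlet(gamma) prior."""
    return word_counts + gamma                 # Dirichlet(gamma + counts)

# Toy usage with random data (dimensions are assumptions, not from the paper).
rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 128))            # 100 image feature vectors, 128-dim
mu, tau2 = gaussian_mean_posterior(feats.sum(axis=0), n=len(feats))

vocab_size = 9866                              # vocabulary size reported in the paper
counts = rng.integers(0, 5, size=vocab_size)   # toy word counts for one component
alpha_post = multinomial_posterior(counts)
```

Because both likelihood/prior pairs are conjugate, each component's posterior is summarized by simple sufficient statistics (per-component feature sums and word counts), which is presumably what makes consolidating sub-models computed by distributed workers tractable in the authors' setting.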