X-DMM: Fast and Scalable Model Based Text Clustering
Authors: Linwei Li, Liangchen Guo, Zhenying He, Yinan Jing, X. Sean Wang (pp. 4197-4204)
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the performance of X-DMM on several real-world datasets, and the experimental results show that X-DMM achieves substantial speedup compared with existing state-of-the-art algorithms without clustering accuracy degradation. |
| Researcher Affiliation | Academia | Linwei Li (1), Liangchen Guo (1), Zhenying He (1,2,3), Yinan Jing (1,2,3), X. Sean Wang (1,2,3). (1) School of Computer Science and Technology, Fudan University; (2) Shanghai Key Lab of Data Science; (3) Shanghai Institute of Intelligent Electronics & Systems, China |
| Pseudocode | Yes | Algorithm 1 The GSDMM algorithm (...) Algorithm 2 The Metropolis-Hastings algorithm (...) Algorithm 3 Parallel training of DMM |
| Open Source Code | No | The paper does not provide an explicit statement about the release of its source code or a link to a code repository for X-DMM. |
| Open Datasets | Yes | We use four real-world datasets: 20ng, QA, ohsumed, and Reuters. (...) NYTimes article dataset |
| Dataset Splits | No | The paper mentions using a 'validation' step in the algorithm description but does not provide specific details on how validation sets were created or used for hyperparameter tuning in the experimental setup (e.g., specific percentages or sample counts for train/validation/test splits). |
| Hardware Specification | Yes | The experiments are conducted on a PC with Intel CPU i5-7400 and Nvidia GPU GTX-1080. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies (e.g., programming languages, libraries, or frameworks). |
| Experiment Setup | Yes | Similar to (Yin and Wang 2014), we set α = 0.1 and β = 0.1 for GSDMM. (...) Similar to (Griffiths and Steyvers 2004), we set α = 50/K and β = 0.1. (...) The cluster numbers are set as K listed in Table 4. |
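
For readers unfamiliar with the GSDMM baseline named in the Pseudocode and Experiment Setup rows, the following is a minimal sketch of collapsed Gibbs sampling for the Dirichlet Multinomial Mixture model in the style of Yin and Wang (2014), using the quoted defaults α = 0.1 and β = 0.1. It is not the paper's Algorithm 1 verbatim and it is not X-DMM: it has no Metropolis-Hastings step and no parallel training. The function name `gsdmm`, its parameters, and the toy documents are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def gsdmm(docs, K, alpha=0.1, beta=0.1, iters=10, seed=0):
    """Illustrative collapsed Gibbs sampler for DMM (hypothetical helper).

    docs: list of token lists. Returns one cluster label per document.
    """
    rng = random.Random(seed)
    V = len({w for d in docs for w in d})   # vocabulary size
    m = [0] * K                             # documents per cluster
    n = [0] * K                             # tokens per cluster
    nw = [Counter() for _ in range(K)]      # word counts per cluster
    counts = [Counter(d) for d in docs]     # word counts per document

    def move(i, k, sign):
        # Add (sign=+1) or remove (sign=-1) document i from cluster k.
        m[k] += sign
        n[k] += sign * len(docs[i])
        for w, c in counts[i].items():
            nw[k][w] += sign * c

    # Random initial assignment.
    z = [rng.randrange(K) for _ in docs]
    for i, k in enumerate(z):
        move(i, k, +1)

    for _ in range(iters):
        for i in range(len(docs)):
            move(i, z[i], -1)               # exclude document i from the counts
            logp = []
            for k in range(K):
                # Unnormalised log p(z_i = k | rest); the factor
                # 1 / (D - 1 + K * alpha) is the same for every k and is dropped.
                lp = math.log(m[k] + alpha)
                for w, c in counts[i].items():
                    for j in range(c):
                        lp += math.log(nw[k][w] + beta + j)
                for j in range(len(docs[i])):
                    lp -= math.log(n[k] + V * beta + j)
                logp.append(lp)
            top = max(logp)                 # subtract the max for numerical stability
            weights = [math.exp(lp - top) for lp in logp]
            z[i] = rng.choices(range(K), weights=weights)[0]
            move(i, z[i], +1)
    return z


# Toy usage with made-up documents.
docs = [["gpu", "cuda", "kernel"], ["gpu", "kernel"],
        ["stock", "market"], ["market", "price", "stock"]]
labels = gsdmm(docs, K=2, iters=5, seed=1)
```

The sketch works in log space because the product over tokens underflows quickly on longer documents. For the second setting quoted above (α = 50/K, β = 0.1), the same function could be called with `alpha=50 / K`, although the excerpt elides which model that setting applies to.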