Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Unidimensional Clustering of Discrete Data Using Latent Tree Models
Authors: April Liu, Leonard Poon, Nevin Zhang
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive empirical studies have been conducted to compare the new method with LCM and several other methods (K-means, kernel K-means and spectral clustering) that are not model-based. |
| Researcher Affiliation | Academia | (1) Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong; (2) Department of Mathematics and Information Technology, The Hong Kong Institute of Education, Hong Kong |
| Pseudocode | Yes | Algorithm 1 shows the pseudo-code for our algorithm. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository for the described methodology. |
| Open Datasets | Yes | The real-world data sets were from the UCI machine learning repository. |
| Dataset Splits | No | The paper describes a process for learning LCMs where cardinality is gradually increased and parameters re-estimated until the model score ceases to increase (guided by AIC/BIC). While this acts as a form of model selection/validation, it does not specify explicit train/validation dataset splits with percentages or counts. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions methods like EM algorithm and algorithms from other papers but does not specify software names with version numbers for dependencies (e.g., Python, PyTorch, scikit-learn versions). |
| Experiment Setup | Yes | In our experiments, the threshold δ is set at 3 as suggested by Kass and Raftery (1995)... To do so, we initially set the cardinality of Y1 at 2 and optimized the probability parameters using the EM algorithm... Then the cardinality is gradually increased and the parameters are re-estimated after each increase. The process stops when model score ceases to increase. |
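
The cardinality search quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it fits a plain latent class model (LCM) rather than a latent tree model, assumes binary attributes, and uses BIC as the model score. The function names (`fit_lcm`, `bic_score`, `select_cardinality`), the smoothing constants, the fixed random seed, and the `max_k` cap are all hypothetical choices made for the example.

```python
import math
import random


def class_joint(row, priors, thetas):
    """Return P(Y=c, X=row) for every latent state c (binary attributes assumed)."""
    joint = []
    for prior, theta in zip(priors, thetas):
        p = prior
        for x, t in zip(row, theta):
            p *= t if x else 1.0 - t
        joint.append(p)
    return joint


def log_likelihood(data, priors, thetas):
    """Log-likelihood of the data under the latent class model."""
    return sum(math.log(sum(class_joint(row, priors, thetas))) for row in data)


def fit_lcm(data, k, iters=200, tol=1e-6, seed=0):
    """Fit an LCM with k latent states by EM; returns (loglik, priors, thetas)."""
    rng = random.Random(seed)
    n, m = len(data), len(data[0])
    priors = [1.0 / k] * k
    thetas = [[rng.uniform(0.25, 0.75) for _ in range(m)] for _ in range(k)]
    loglik = log_likelihood(data, priors, thetas)
    for _ in range(iters):
        # E-step: posterior responsibility of each latent state for each row.
        resp = []
        for row in data:
            joint = class_joint(row, priors, thetas)
            total = sum(joint)
            resp.append([p / total for p in joint])
        # M-step: re-estimate parameters. The small constants are an assumed
        # smoothing that keeps probabilities strictly between 0 and 1.
        for c in range(k):
            weight = sum(r[c] for r in resp)
            priors[c] = weight / n
            for j in range(m):
                on = sum(r[c] for r, row in zip(resp, data) if row[j])
                thetas[c][j] = (on + 1e-3) / (weight + 2e-3)
        new_loglik = log_likelihood(data, priors, thetas)
        converged = new_loglik - loglik < tol
        loglik = new_loglik
        if converged:
            break
    return loglik, priors, thetas


def bic_score(loglik, k, m, n):
    """BIC in 'higher is better' form: loglik - (d / 2) * log(n)."""
    d = (k - 1) + k * m  # free parameters: class priors plus one per attribute per class
    return loglik - 0.5 * d * math.log(n)


def select_cardinality(data, max_k=10):
    """Start at cardinality 2 and increase until the score stops improving."""
    n, m = len(data), len(data[0])
    best_k, best_score = None, -math.inf
    for k in range(2, max_k + 1):
        loglik, _, _ = fit_lcm(data, k)  # parameters are re-estimated after each increase
        score = bic_score(loglik, k, m, n)
        if score <= best_score:
            break
        best_k, best_score = k, score
    return best_k, best_score
```

Calling `select_cardinality(rows)` on a list of equal-length 0/1 lists returns the last cardinality whose score improved, together with that score. A single EM run per cardinality is used here for brevity; EM is sensitive to initialization, so a more careful version would likely use multiple restarts.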