Demystifying Information-Theoretic Clustering
Authors: Greg Ver Steeg, Aram Galstyan, Fei Sha, Simon DeDeo
ICML 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate the proposed approach on synthetic data and commonly used datasets for clustering and contrast to existing approaches for information-theoretic clustering. |
| Researcher Affiliation | Academia | 1 Information Sciences Institute, 4676 Admiralty Way, Marina del Rey, CA 90292, USA; 2 University of Southern California, Los Angeles, CA 90089, USA; 3 Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, USA; 4 School of Informatics and Computing, Indiana University, 901 E 10th St., Bloomington, IN 47408, USA |
| Pseudocode | No | The paper discusses numerical procedures and heuristic approaches in text, but does not include a clearly labeled pseudocode or algorithm block. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the described methodology or a link to a code repository. |
| Open Datasets | Yes | UCI datasets: We also consider three standard clustering datasets from the UCI Machine Learning database (Bache & Lichman, 2013): glass, iris, and wine. |
| Dataset Splits | No | The paper mentions 'synthetic data' and 'real-world datasets' like UCI datasets but does not explicitly specify exact percentages, sample counts, or methodology for splitting data into training, validation, and test sets. |
| Hardware Specification | No | The paper does not specify any particular hardware components (e.g., CPU, GPU models, memory) used for conducting the experiments. |
| Software Dependencies | No | The paper mentions other methods like 'k-means' and references other works for algorithms (e.g., 'semidefinite optimization based on this criteria (Wang & Sha, 2011)'), and mentions CPLEX in Appendix B, but it does not list specific software libraries or tools with their version numbers that are critical for reproducing the experiments. |
| Experiment Setup | No | The paper describes some parameters for the synthetic data and discusses 'k' for the k-NN estimator, but it does not provide a comprehensive 'Experimental Setup' section detailing hyperparameters, optimization settings, or system-level configurations for the main experiments. |