On the price of explainability for some clustering problems
Authors: Eduardo S Laber, Lucas Murtinho
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Another contribution is a simple and efficient algorithm for building explainable clusterings for the k-means problem. We provide empirical evidence that its performance is better than the current state of the art for decision-tree based explainable clustering. |
| Researcher Affiliation | Academia | Department of Computer Science, PUC-Rio, Brazil. |
| Pseudocode | Yes | Algorithm 1 Ex-k Center(X: set of points) ... Algorithm 2 Build Tree(X S) ... Algorithm 3 Ex-Single Link(X) |
| Open Source Code | Yes | Our code is available at https://github.com/lmurtinho/ExKMC. |
| Open Datasets | Yes | The datasets Iris, Wine, Breast Cancer, Digits, Covtype, Mice and Newsgroup are available in Python's scikit-learn; Cifar-10 is available in TensorFlow; Anuran and Avila were downloaded from UCI. |
| Dataset Splits | No | The paper does not provide explicit training, validation, or test dataset splits (e.g., percentages or sample counts). It mentions using datasets and running the KMeans algorithm with default parameters for an initial unrestricted solution. |
| Hardware Specification | Yes | All our experiments were executed on a MacBook Air with 8 GB of RAM and a 1.6 GHz Dual-Core Intel Core i5 processor, running macOS Catalina, version 10.15.7. |
| Software Dependencies | No | The paper mentions software like Python's scikit-learn and TensorFlow, but does not specify their version numbers for reproducibility. |
| Experiment Setup | Yes | For each iteration, we initially achieve an unrestricted solution Cini by running the KMeans algorithm provided in the scikit-learn package with default parameters. |
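
The setup described in the last row can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the helper name `unrestricted_kmeans`, the choice of the Iris dataset, `k = 3`, and the fixed seed are all assumptions made for the example. Because the table notes that library versions are not reported, the sketch also prints the installed scikit-learn version; KMeans defaults can differ between releases.

```python
import sklearn
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris


def unrestricted_kmeans(X, k, seed=None):
    """Fit scikit-learn's KMeans, leaving every parameter except
    the number of clusters and the random seed at its default."""
    model = KMeans(n_clusters=k, random_state=seed)
    model.fit(X)
    return model


# Iris is one of the datasets listed as available in scikit-learn.
X, y = load_iris(return_X_y=True)

k = 3  # assumption for this sketch: one cluster per Iris class
model = unrestricted_kmeans(X, k, seed=0)  # seed fixed only for repeatability

# Record the library version alongside the result, since defaults may change.
print("scikit-learn version:", sklearn.__version__)
print("k-means cost (inertia):", model.inertia_)
```

The fitted `model.cluster_centers_` and `model.labels_` would then serve as the unrestricted starting solution that an explainable (decision-tree based) clustering is compared against.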