XClusters: Explainability-First Clustering

Authors: Hyunseung Hwang, Steven Euijong Whang

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that our method can improve the explainability of any clustering that fits in our framework.
Researcher Affiliation | Academia | Hyunseung Hwang, Steven Euijong Whang KAIST {aguno, swhang}@kaist.ac.kr
Pseudocode | Yes | Algorithm 1: XClusters algorithm
Open Source Code | No | The paper does not provide an explicit statement or link for the open-sourcing of the methodology's code.
Open Datasets | Yes | DS4C (Kim 2020): a public COVID-19 dataset containing patient data, policy data, and provincial data released by the Korea Centers for Disease Control & Prevention (KCDC). We use floating population data of the city of Seoul for each age and gender group (Jan. 2020–May 2020). Contracts (Linville 2022): a public contract dataset maintained by the State of Washington.
Dataset Splits | No | The paper mentions searching k within a range and repeating experiments, but it does not provide specific training, validation, and test dataset splits for reproducibility.
Hardware Specification | Yes | All experiments are performed on a server with Intel Xeon Gold 5115 CPUs.
Software Dependencies | No | The paper mentions using “Scikit-learn (Pedregosa et al. 2011) for the decision tree training” but does not specify its version number or any other software dependencies with their versions.
Experiment Setup | Yes | We search k within the range [3, 4, ..., 11] for all datasets. For XClusters, we set λ = 1, and ϵ_b = 0.05 as default values. We repeat each experiment 10 times.
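
The setup quoted above can be read as a small grid of runs. The following is a minimal sketch of that grid, not the authors' code: the RunConfig structure, the build_runs helper, and all field names are hypothetical, and it assumes that every one of the 10 repetitions covers every k in the search range. The only values taken from the source are the k range, λ = 1, ϵ_b = 0.05, and the repeat count.

```python
from dataclasses import dataclass
from itertools import product

# Values quoted in the Experiment Setup entry.
K_VALUES = list(range(3, 12))  # k in [3, 4, ..., 11]
LAMBDA = 1.0                   # default lambda for XClusters
EPSILON_B = 0.05               # default epsilon_b for XClusters
N_REPEATS = 10                 # each experiment is repeated 10 times


@dataclass(frozen=True)
class RunConfig:
    """One hypothetical run; this structure is illustrative, not from the paper."""
    k: int
    lam: float
    epsilon_b: float
    repeat: int


def build_runs():
    """Enumerate every (k, repeat) combination.

    Assumption: each repetition reruns the full search over k.
    """
    return [
        RunConfig(k, LAMBDA, EPSILON_B, repeat)
        for k, repeat in product(K_VALUES, range(N_REPEATS))
    ]


runs = build_runs()
print(len(runs))  # number of runs per dataset under the assumption above
```

Under this reading, a reproduction would iterate over the list once per dataset. The source gives no random seeds or library versions, so those would still have to be chosen and recorded.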