XClusters: Explainability-First Clustering
Authors: Hyunseung Hwang, Steven Euijong Whang
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that our method can improve the explainability of any clustering that fits in our framework. |
| Researcher Affiliation | Academia | Hyunseung Hwang, Steven Euijong Whang; KAIST; {aguno, swhang}@kaist.ac.kr |
| Pseudocode | Yes | Algorithm 1: XClusters algorithm |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-sourcing of the methodology's code. |
| Open Datasets | Yes | DS4C (Kim 2020): a public COVID-19 dataset containing patient data, policy data, and provincial data released by the Korea Centers for Disease Control & Prevention (KCDC). We use floating population data of the city of Seoul for each age and gender group (Jan. 2020–May 2020). Contracts (Linville 2022): a public contract dataset maintained by the State of Washington. |
| Dataset Splits | No | The paper mentions searching k within a range and repeating experiments, but it does not provide specific training, validation, and test dataset splits for reproducibility. |
| Hardware Specification | Yes | All experiments are performed on a server with Intel Xeon Gold 5115 CPUs. |
| Software Dependencies | No | The paper mentions using “Scikit-learn (Pedregosa et al. 2011) for the decision tree training” but does not specify its version number or any other software dependencies with their versions. |
| Experiment Setup | Yes | We search k within the range [3, 4, ..., 11] for all datasets. For XClusters, we set λ = 1 and ϵ_b = 0.05 as default values. We repeat each experiment 10 times. |
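
The experiment setup in the last row (k searched over 3 to 11, λ = 1, ϵ_b = 0.05, 10 repetitions) can be sketched as a small search loop. This is only an illustration of the reported grid, not the XClusters algorithm, whose code is not released. The names `SetupConfig`, `search_k`, and `toy_score` are hypothetical, and treating a lower mean score as better is an assumption; in a real run, `score_fn` would be the step that clusters the data and trains the decision tree.

```python
import random
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class SetupConfig:
    """Values reported in the Experiment Setup row (field names are hypothetical)."""
    k_values: Tuple[int, ...] = tuple(range(3, 12))  # k = 3, 4, ..., 11
    lam: float = 1.0      # λ = 1
    eps_b: float = 0.05   # ϵ_b = 0.05
    repeats: int = 10     # each experiment is repeated 10 times


def search_k(
    score_fn: Callable[[int, int, SetupConfig], float],
    cfg: SetupConfig = SetupConfig(),
) -> Tuple[int, Dict[int, float]]:
    """Average score_fn over cfg.repeats seeds for every k and return
    the k with the lowest mean score (assumption: lower is better)."""
    mean_scores = {
        k: mean(score_fn(k, seed, cfg) for seed in range(cfg.repeats))
        for k in cfg.k_values
    }
    best_k = min(mean_scores, key=mean_scores.get)
    return best_k, mean_scores


def toy_score(k: int, seed: int, cfg: SetupConfig) -> float:
    """Hypothetical placeholder objective, NOT the paper's objective.
    It is minimized near k = 6, plus a small seed-dependent noise term."""
    rng = random.Random(seed)
    return abs(k - 6) + cfg.lam * 0.1 * rng.random()


if __name__ == "__main__":
    best_k, scores = search_k(toy_score)
    print("best k:", best_k)
    for k, s in scores.items():
        print(k, round(s, 3))
```

Passing the configuration object into `score_fn` keeps λ and ϵ_b in one place, so changing a default does not require touching the search loop.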