A Framework for Minimal Clustering Modification via Constraint Programming
Authors: Chia-Tung Kuo, S. S. Ravi, Thi-Bich-Hanh Dao, Christel Vrain, Ian Davidson
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically validate its usefulness through experiments on social network and medical imaging data sets. |
| Researcher Affiliation | Academia | Chia-Tung Kuo, University of California, Davis, tomkuo@ucdavis.edu; S. S. Ravi, University at Albany, sravi@albany.edu; Thi-Bich-Hanh Dao, University of Orleans, thi-bich-hanh.dao@univ-orleans.fr; Christel Vrain, University of Orleans, Christel.Vrain@univ-orleans.fr; Ian Davidson, University of California, Davis, davidson@cs.ucdavis.edu |
| Pseudocode | Yes | Figure 2: CP optimization encoding where the user provides a set of desired (feature-wise) diameters D as feedback. |
| Open Source Code | Yes | We provide enough details to reproduce our results and our code is made available (footnote 2: https://sites.google.com/site/chiatungkuo/publication). |
| Open Datasets | Yes | We apply our proposed approach to a network data set: Facebook-egonets from Stanford SNAP Data sets (Leskovec and Krevl 2014). |
| Dataset Splits | No | The paper mentions running k-means multiple times and selecting the best result, but does not provide specific train/validation/test split percentages, sample counts, or a detailed splitting methodology for reproducibility. |
| Hardware Specification | No | Consequently our experiments on the Facebook data (n = 4039, k = 4, f = 2) and fMRI data (n = 1730, k = 4, f = 2) each take less than 2 minutes to finish on a 12-core workstation. |
| Software Dependencies | No | Note that we chose to implement our model in the CP language Numberjack (Hebrard, O'Mahony, and O'Sullivan 2010) due to its simple interface and its use of state-of-the-art integer linear program (ILP) solvers. ILP solvers such as Gurobi (Inc. 2015) (used in our experiments) can easily exploit multi-core architectures. |
| Experiment Setup | Yes | We choose the upper and lower bounds according to the averages in the initial summary and set bounds [0.36, 0.4] for gender and [0.13, 0.15] for language so that these two features are balanced across clusters. |
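
The bounds quoted in the Experiment Setup row can be read as a per-cluster check: for each cluster, the mean of a binary feature should fall inside the stated interval. The sketch below illustrates only that check in plain Python. It is not the paper's CP encoding or its Numberjack model, and the names `BOUNDS`, `cluster_feature_means`, and `is_balanced` are hypothetical, as is the assumption that each feature is stored as a 0/1 column.

```python
from collections import defaultdict

# Intervals quoted in the Experiment Setup row. The dictionary keys are
# illustrative labels, not identifiers taken from the paper's code.
BOUNDS = {"gender": (0.36, 0.40), "language": (0.13, 0.15)}


def cluster_feature_means(labels, features):
    """Return {cluster: {feature: mean}} for 0/1 feature columns.

    labels   -- list of cluster ids, one per instance
    features -- dict mapping a feature name to a list of 0/1 values
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for i, cluster in enumerate(labels):
        counts[cluster] += 1
        for name, column in features.items():
            sums[cluster][name] += column[i]
    return {
        cluster: {name: sums[cluster][name] / counts[cluster] for name in features}
        for cluster in counts
    }


def is_balanced(labels, features, bounds=BOUNDS):
    """True if every cluster's mean lies within [lo, hi] for each bounded feature."""
    means = cluster_feature_means(labels, features)
    return all(
        lo <= means[cluster][name] <= hi
        for cluster in means
        for name, (lo, hi) in bounds.items()
    )
```

A clustering that passes `is_balanced` satisfies the quoted intervals in every cluster. Finding the minimal modification that makes a failing clustering pass is the optimization problem the paper addresses and is not attempted here.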