A Scalable Deterministic Global Optimization Algorithm for Clustering Problems
Authors: Kaixun Hua, Mingfei Shi, Yankai Cao
ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We performed numerical experiments on both synthetic and real-world datasets and compared our proposed algorithms with the off-the-shelf global optimal solvers and classical local optimal algorithms. The results reveal a strong performance and scalability of our algorithm. |
| Researcher Affiliation | Academia | 1Department of Chemical and Biological Engineering, University of British Columbia, Vancouver, British Columbia, Canada. |
| Pseudocode | Yes | Algorithm 1 Branch-and-Bound Clustering Scheme |
| Open Source Code | Yes | We present an open-source implementation of the proposed algorithm in Julia. ... The complete code files can be found in https://github. com/kingsley1989/global_kmeans. |
| Open Datasets | Yes | The Iris dataset (Fisher, 1936) has 150 samples and 4 attributes, while the Seeds dataset (Charytanowicz et al., 2010) contains 210 samples and 7 attributes. ... The Padberg and Rinald s hole drilling dataset (Padberg & Rinaldi, 1991) ... The glass identification dataset (Dua & Graff, 2017) |
| Dataset Splits | No | The paper does not explicitly provide training/test/validation dataset splits, as it focuses on unsupervised clustering problems which typically operate on the full dataset to find optimal clusters. |
| Hardware Specification | Yes | All serial experiments are executed on a 2.3GHz quad-core 10th-generation Intel Core i7 processor with 16G RAM. The parallel experiments are executed on the Niagara Cluster of Compute Canada. |
| Software Dependencies | Yes | Our algorithm is implemented in Julia 1.5.3. ... compared with the off-the-shelf global optimal solvers and classical local optimal algorithms. The result of k-means clustering algorithm is generated by repeating with 100 trials using random initialization. ... classic BB algorithm implemented in the state-of-the-art global optimizer CPLEX 20.1.0 (Cplex, 2020), and classic k-means clustering algorithm implemented in Julia Package Clustering. |
| Experiment Setup | Yes | The result of k-means clustering algorithm is generated by repeating with 100 trials using random initialization. The worst, average and best k-means objectives are recorded. Both our algorithm and CPLEX terminate with a time limit of 4 hours. ... For sample grouping, we limited each group s size under min(162/d k, 10 k) when decomposing the lower bounding problem. ... All datasets are generated with 3 Gaussian clusters randomly with seed = 1. ... For each dataset, we solve two clustering problems (k = 3 and k = 4). |