Global Optimization of K-Center Clustering

Authors: Mingfei Shi, Kaixun Hua, Jiayang Ren, Yankai Cao

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental This work provides a practical global optimization algorithm for this task based on a reduced-space spatial branch and bound scheme. This algorithm can guarantee convergence to the global optimum by only branching on the centers of clusters, which is independent of the dataset s cardinality. In addition, a set of feasibility-based bounds tightening techniques are proposed to narrow down the domain of centers and significantly accelerate the convergence. To demonstrate the capacity of this algorithm, we present computational results on 32 datasets.
Researcher Affiliation Academia 1Department of Chemical and Biological Engineering, University of British Columbia, Vancouver, British Columbia, Canada.
Pseudocode Yes Algorithm 1 Farthest First Traversal; Algorithm 2 Branch and Bound Scheme
Open Source Code Yes We provide an open-source Julia implementation of the algorithm and perform extensive computational experiments on 32 datasets... All experiments are implemented in Julia 1.6.1 and the complete code files can be found in https://github.com/Yankai Group/Kcenter.
Open Datasets Yes We test the performance of our algorithm and CPLEX on the Niagara Compute Canada with a time limit of 4 hours... We first consider numerical results on artificially generated datasets from (Hua et al., 2021)... We then evaluate the performance of our algorithms on 30 datasets from UCI Machine Learning Repository (Dua & Graff, 2017), one dataset called PR2392 from (Padberg & Rinaldi, 1991), and one dataset called HEMI from (Wang et al., 2022).
Dataset Splits No The paper focuses on global optimization and does not specify traditional training/validation/test splits for model evaluation or reproduction. It discusses optimality gap and convergence to a solution for the entire dataset.
Hardware Specification No The experiments were run on "the Niagara Compute Canada". While this indicates a high-performance computing environment, it does not provide specific hardware details like CPU models, GPU models, or memory specifications.
Software Dependencies Yes All experiments are implemented in Julia 1.6.1... We compare the numerical results of our algorithm with the state-of-art global optimizer CPLEX 20.1.0 (Cplex, 2020) and the heuristic algorithm, Farthest First Traversal... Gurobi (Optimization, 2020).
Experiment Setup Yes In all the experiments, the number of Cluster K is set to 3. All experiments are implemented in Julia 1.6.1 and the complete code files can be found in https://github.com/Yankai Group/Kcenter. We test the performance of our algorithm and CPLEX on the Niagara Compute Canada with a time limit of 4 hours. In our implementation, we use two methods to obtain candidate solutions. At the root node, we use a heuristic method called Farthest First Traversal to obtain a candidate solution... In our implementation, we set a threshold on the maximum number of balls (default: 50) utilized to tighten bounds.