Global Optimization of K-Center Clustering
Authors: Mingfei Shi, Kaixun Hua, Jiayang Ren, Yankai Cao
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This work provides a practical global optimization algorithm for this task based on a reduced-space spatial branch and bound scheme. This algorithm can guarantee convergence to the global optimum by only branching on the centers of clusters, which is independent of the dataset s cardinality. In addition, a set of feasibility-based bounds tightening techniques are proposed to narrow down the domain of centers and significantly accelerate the convergence. To demonstrate the capacity of this algorithm, we present computational results on 32 datasets. |
| Researcher Affiliation | Academia | 1Department of Chemical and Biological Engineering, University of British Columbia, Vancouver, British Columbia, Canada. |
| Pseudocode | Yes | Algorithm 1 Farthest First Traversal; Algorithm 2 Branch and Bound Scheme |
| Open Source Code | Yes | We provide an open-source Julia implementation of the algorithm and perform extensive computational experiments on 32 datasets... All experiments are implemented in Julia 1.6.1 and the complete code files can be found in https://github.com/Yankai Group/Kcenter. |
| Open Datasets | Yes | We test the performance of our algorithm and CPLEX on the Niagara Compute Canada with a time limit of 4 hours... We first consider numerical results on artificially generated datasets from (Hua et al., 2021)... We then evaluate the performance of our algorithms on 30 datasets from UCI Machine Learning Repository (Dua & Graff, 2017), one dataset called PR2392 from (Padberg & Rinaldi, 1991), and one dataset called HEMI from (Wang et al., 2022). |
| Dataset Splits | No | The paper focuses on global optimization and does not specify traditional training/validation/test splits for model evaluation or reproduction. It discusses optimality gap and convergence to a solution for the entire dataset. |
| Hardware Specification | No | The experiments were run on "the Niagara Compute Canada". While this indicates a high-performance computing environment, it does not provide specific hardware details like CPU models, GPU models, or memory specifications. |
| Software Dependencies | Yes | All experiments are implemented in Julia 1.6.1... We compare the numerical results of our algorithm with the state-of-art global optimizer CPLEX 20.1.0 (Cplex, 2020) and the heuristic algorithm, Farthest First Traversal... Gurobi (Optimization, 2020). |
| Experiment Setup | Yes | In all the experiments, the number of Cluster K is set to 3. All experiments are implemented in Julia 1.6.1 and the complete code files can be found in https://github.com/Yankai Group/Kcenter. We test the performance of our algorithm and CPLEX on the Niagara Compute Canada with a time limit of 4 hours. In our implementation, we use two methods to obtain candidate solutions. At the root node, we use a heuristic method called Farthest First Traversal to obtain a candidate solution... In our implementation, we set a threshold on the maximum number of balls (default: 50) utilized to tighten bounds. |