An Efficient Semismooth Newton based Algorithm for Convex Clustering
Authors: Yancheng Yuan, Defeng Sun, Kim-Chuan Toh
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive numerical experiments on both simulated and real data demonstrate that our algorithm is highly efficient and robust for solving large-scale problems. |
| Researcher Affiliation | Academia | (1) Department of Mathematics, National University of Singapore; (2) Department of Applied Mathematics, Hong Kong Polytechnic University; (3) Department of Mathematics, National University of Singapore. |
| Pseudocode | Yes | Algorithm 1 SSNAL for (P), Algorithm 2 SSNCG for (9), Algorithm 3 IADMM for (P) |
| Open Source Code | No | The paper mentions using open source software CVXCLUSTR, but does not provide concrete access to their own implementation code (written in MATLAB). |
| Open Datasets | Yes | MNIST, Fisher Iris, WINE, Yale Face B (10Train subset). Fisher, Fisher iris dataset, 1936, UCI Machine Learning Repository, https://archive.ics.uci.edu/ml/datasets/iris. |
| Dataset Splits | No | The paper discusses solving the convex clustering model for a range of gamma values to generate a clustering path and evaluates performance, but does not specify train/validation/test splits or cross-validation methodology for data partitioning. |
| Hardware Specification | Yes | All our computational results are obtained from a desktop having 16 cores with 32 Intel Xeon E5-2650 processors at 2.6 GHz and 64 GB memory. |
| Software Dependencies | No | The paper states, 'We write our code in MATLAB without any dedicated C functions.' and mentions using 'CVXCLUSTR' which is 'an R package', but it does not provide specific version numbers for MATLAB, R, or any other software dependencies. |
| Experiment Setup | Yes | In the experiments, we choose k = 10, φ = 0.5 (for the weights w_ij) and γ ∈ [0.2 : 0.2 : 10] to generate the clustering path. In our experiments, we set ϵ = 10^{-6} unless specified otherwise. When we generate the clustering path for the first parameter value of γ, we first run the IADMM introduced in Algorithm 3 for 100 iterations to generate an initial point, then we use SSNAL to solve (2). |
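
The Experiment Setup row can be made concrete with a small sketch of the γ grid and the pairwise weights. This is an illustration, not the authors' MATLAB code: the summary above gives only k = 10 and φ = 0.5, so the Gaussian k-nearest-neighbor weight formula w_ij = exp(-φ‖a_i − a_j‖²) used here is an assumption (a weighting commonly used for convex clustering), and the helper names `gamma_grid` and `knn_gaussian_weights` are hypothetical.

```python
import math


def gamma_grid(start=0.2, step=0.2, stop=10.0):
    """Mimic the MATLAB-style range start:step:stop.

    The number of points is computed first so that accumulated
    floating-point error cannot drop or add an endpoint.
    """
    n = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 10) for i in range(n)]


def knn_gaussian_weights(points, k=10, phi=0.5):
    """Return {(i, j): w_ij} with i < j.

    ASSUMPTION: a pair gets weight exp(-phi * ||a_i - a_j||^2) when one
    point is among the other's k nearest neighbors; all other pairs are
    omitted (weight zero). The summary does not state this formula.
    """
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    n = len(points)
    weights = {}
    for i in range(n):
        # Indices of the k points closest to point i (excluding i itself).
        others = (j for j in range(n) if j != i)
        nearest = sorted(others, key=lambda j: sqdist(points[i], points[j]))[:k]
        for j in nearest:
            edge = (min(i, j), max(i, j))  # store each undirected pair once
            weights[edge] = math.exp(-phi * sqdist(points[i], points[j]))
    return weights


gammas = gamma_grid()  # 50 values: 0.2, 0.4, ..., 10.0
```

Each γ in `gammas` would then correspond to one solve along the clustering path, with the solution for one γ typically reused as a warm start for the next; the solver itself (SSNAL, initialized by 100 IADMM iterations for the first γ) is not sketched here.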