On hyperparameter tuning in general clustering problems
Authors: Xinjie Fan, Yuguang Yue, Purnamrita Sarkar, Y. X. Rachel Wang
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In a variety of simulation and real data experiments, we show that our framework outperforms other widely used tuning procedures in a broad range of parameter settings. (...) Finally, Section 5 contains detailed simulated and real data experiments |
| Researcher Affiliation | Academia | (1) Department of Statistics and Data Sciences, University of Texas at Austin; (2) School of Mathematics and Statistics, University of Sydney. |
| Pseudocode | Yes | Algorithm 1 MAx-TRace (MATR) for known r. (...) Algorithm 2 MATR-CV. (...) Algorithm 3 Splitting (...) Algorithm 4 Cluster Test |
| Open Source Code | No | The paper does not provide any explicit statement or link for open-source code for the described methodology. |
| Open Datasets | Yes | We compare MATR with ECV and CL on the football (Girvan & Newman, 2002), political books and the political blogs (Adamic & Glance, 2005) datasets. (...) test set provided by (Pedregosa et al., 2011) of the Optical Recognition of Handwritten Digits Data Set (...) Avila dataset (https://archive.ics.uci.edu/ml/datasets/Avila) |
| Dataset Splits | Yes | Algorithm 2 MATR-CV. (...) training ratio γtrain, trace gap for j = 1 : J do (...) We use a training ratio of 0.9 and the L2 loss throughout. |
| Hardware Specification | Yes | MATR-CV takes around 2 hours to complete while SIL takes around 7 hours and GAP takes around 30 hours to finish on a single node of two Xeon E5-2690 v3 with 24 cores. |
| Software Dependencies | No | The paper does not specify version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | Since $\lambda \in [0, 1]$ for SDP-1, we choose $\lambda \in \{0, \dots, 20\}/20$ in all the examples. (...) our candidate set of $\theta$ is $\{t\alpha/20\}$ for $t = 1, \dots, 20$ and $\alpha = \max_{i,j} \lVert Y_i - Y_j \rVert_2$. (...) We vary c from 0 to 200. (...) For all methods, we set the maximal number of clusters to be square root of the dataset size. |
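
The candidate grids quoted in the Experiment Setup row can be sketched in a few lines. This is a minimal illustration, not the authors' code (the paper releases none): the function names `lambda_grid`, `theta_grid`, and `max_clusters` are hypothetical, α is taken to be the largest pairwise Euclidean distance (an assumption about the exact norm), and rounding the square root down is also an assumption, since the quoted text does not say how it is rounded.

```python
import math
from itertools import combinations


def lambda_grid(steps=20):
    """Candidate values {0, ..., steps} / steps, i.e. an even grid on [0, 1]."""
    return [t / steps for t in range(steps + 1)]


def theta_grid(points, steps=20):
    """Candidate values t * alpha / steps for t = 1, ..., steps.

    alpha is assumed here to be the largest pairwise Euclidean
    distance among the data points.
    """
    alpha = max(math.dist(p, q) for p, q in combinations(points, 2))
    return [t * alpha / steps for t in range(1, steps + 1)]


def max_clusters(n_samples):
    """Upper bound on the number of clusters: sqrt(n), rounded down (assumed)."""
    return math.isqrt(n_samples)


if __name__ == "__main__":
    # Toy data, made up for illustration only.
    toy_points = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
    print(lambda_grid()[:3])
    print(theta_grid(toy_points)[:3])
    print(max_clusters(100))
```

Each grid would then be passed to the tuning procedure, which scores every candidate and keeps the best one; that scoring step is not shown here.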