Hierarchical Clustering via Spreading Metrics
Authors: Aurko Roy, Sebastian Pokutta
NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also implement the integer program, its LP relaxation, and the rounding algorithm and test it on some synthetic and real world data sets to compare the cost of the rounded solutions to the true optimum, as well as to compare its performance to other hierarchical clustering algorithms used in practice. Our experiments suggest that the hierarchies found by this algorithm are often better than the ones found by linkage based algorithms as well as the k-means algorithm in terms of the error of the best pruning of the tree compared to the ground truth. |
| Researcher Affiliation | Academia | Aurko Roy (College of Computing, Georgia Institute of Technology, Atlanta, GA, USA; aurko@gatech.edu) and Sebastian Pokutta (ISyE, Georgia Institute of Technology, Atlanta, GA, USA; sebastian.pokutta@isye.gatech.edu) |
| Pseudocode | Yes | Algorithm 1: Iterative rounding algorithm to find a low cost ultrametric |
| Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology. |
| Open Datasets | Yes | We considered synthetic data sets and some data sets from the UCI database [36]. |
| Dataset Splits | No | The paper does not provide specific training/validation/test splits, percentages, or details on cross-validation. It only mentions subsampling for larger datasets. |
| Hardware Specification | No | The paper does not provide any specific hardware details used for running the experiments. |
| Software Dependencies | No | The paper mentions using the "dual simplex method" but does not specify any software names with version numbers, such as libraries, frameworks, or solvers. For example, it does not state which specific dual simplex solver was used (e.g., CPLEX, Gurobi) or its version. |
| Experiment Setup | Yes | For the similarity function κ we limited ourselves to using cosine similarity κ_cos and the Gaussian kernel κ_gauss with σ = 1. Since Algorithm 1 requires κ ≥ 0, in practice we use 1 + κ_cos instead of κ_cos. |
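
For reference, here is a minimal sketch of the two similarity functions named in the experiment setup, assuming their standard definitions; the excerpt fixes σ = 1 but does not spell out the Gaussian kernel's normalization, so the exp(-‖x − y‖² / (2σ²)) form below is an assumption:

```python
import numpy as np

def kappa_cos(x, y):
    # Cosine similarity: <x, y> / (||x|| * ||y||), ranges over [-1, 1].
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

def kappa_cos_shifted(x, y):
    # Algorithm 1 requires kappa >= 0, so the paper uses
    # 1 + kappa_cos in place of kappa_cos; this lies in [0, 2].
    return 1.0 + kappa_cos(x, y)

def kappa_gauss(x, y, sigma=1.0):
    # Gaussian (RBF) kernel with sigma = 1 as in the experiments;
    # the exp(-||x - y||^2 / (2 * sigma^2)) normalization is assumed.
    return np.exp(-np.linalg.norm(x - y) ** 2 / (2.0 * sigma ** 2))
```

For example, `kappa_gauss(np.array([0.0, 1.0]), np.array([1.0, 1.0]))` gives exp(−0.5) ≈ 0.607, and the shifted cosine similarity of two orthogonal vectors is exactly 1.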