Hierarchical Clustering via Spreading Metrics

Authors: Aurko Roy, Sebastian Pokutta

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We also implement the integer program, its LP relaxation, and the rounding algorithm, and test them on synthetic and real-world data sets to compare the cost of the rounded solutions to the true optimum, as well as to compare their performance to other hierarchical clustering algorithms used in practice. Our experiments suggest that the hierarchies found by this algorithm are often better than those found by linkage-based algorithms, as well as by the k-means algorithm, in terms of the error of the best pruning of the tree compared to the ground truth. (A sketch of this best-pruning error metric appears after the table.)
Researcher Affiliation | Academia | Aurko Roy (College of Computing, Georgia Institute of Technology, Atlanta, GA, USA; aurko@gatech.edu) and Sebastian Pokutta (ISyE, Georgia Institute of Technology, Atlanta, GA, USA; sebastian.pokutta@isye.gatech.edu)
Pseudocode | Yes | Algorithm 1: Iterative rounding algorithm to find a low cost ultrametric. (A short note on ultrametrics, with a sketch, follows the table.)
Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology.
Open Datasets | Yes | We considered synthetic data sets and some data sets from the UCI database [36].
Dataset Splits | No | The paper does not provide specific training/validation/test splits, percentages, or details on cross-validation; it only mentions subsampling for larger data sets. (A generic subsampling sketch is given below.)
Hardware Specification | No | The paper does not provide any specific hardware details used for running the experiments.
Software Dependencies | No | The paper mentions using the "dual simplex method" but does not name any software with version numbers, such as libraries, frameworks, or solvers; for example, it does not state which dual simplex solver was used (e.g., CPLEX, Gurobi) or its version. (An illustrative way to pin down this dependency is sketched below.)
Experiment Setup | Yes | For the similarity function κ we limited ourselves to using cosine similarity κcos and the Gaussian kernel κgauss with σ = 1. Since Algorithm 1 requires κ ≥ 0, in practice we use 1 + κcos instead of κcos. (Both kernels are sketched below.)
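
To make the evaluation criterion in the Research Type row concrete: each method is scored by the error of the best pruning of its tree against ground-truth labels. The sketch below is our own illustration, not the authors' code; it uses average linkage on iris (a UCI data set standing in for those used in the paper) and, for simplicity, considers only horizontal cuts of the dendrogram, a restricted form of pruning.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cut_tree
from sklearn.datasets import load_iris

def pruning_error(pred, truth):
    """Error of a flat clustering: label each cluster with its majority
    ground-truth class, then count the mislabeled points."""
    mistakes = 0
    for c in np.unique(pred):
        members = truth[pred == c]
        mistakes += len(members) - np.bincount(members).max()
    return mistakes / len(truth)

X, y = load_iris(return_X_y=True)
Z = linkage(X, method="average")   # one of the linkage baselines
cuts = cut_tree(Z)                 # column k = flat clustering at merge level k
best = min(pruning_error(cuts[:, k], y) for k in range(cuts.shape[1]))
print(f"best horizontal-cut error: {best:.3f}")
```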
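On the Pseudocode row: Algorithm 1 outputs an ultrametric, i.e. a metric satisfying the strong triangle inequality d(i, k) ≤ max(d(i, j), d(j, k)), and an ultrametric on n points is exactly the merge-height function of a hierarchical clustering tree. A minimal sketch (our own, with a made-up distance matrix) that checks the property and recovers the tree via single linkage, which is exact on ultrametrics:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

def is_ultrametric(D, tol=1e-9):
    """Check d(i, k) <= max(d(i, j), d(j, k)) for every triple (i, j, k)."""
    for j in range(D.shape[0]):
        # broadcast: entry (i, k) of the max is max(D[i, j], D[j, k])
        if np.any(D > np.maximum(D[:, [j]], D[[j], :]) + tol):
            return False
    return True

# Toy ultrametric: {0, 1} merge at height 2, {2, 3} at height 1, all at height 4.
D = np.array([[0, 2, 4, 4],
              [2, 0, 4, 4],
              [4, 4, 0, 1],
              [4, 4, 1, 0]], dtype=float)
assert is_ultrametric(D)
Z = linkage(squareform(D), method="single")  # merge heights reproduce D exactly
```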
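On the Dataset Splits row: the paper reports only that larger data sets were subsampled, without sizes or a procedure, so the following is a generic uniform-subsampling stand-in rather than a reconstruction of what was done.

```python
import numpy as np

def subsample(X, y, n, seed=0):
    """Uniformly subsample n points without replacement; the sample size and
    random procedure are our assumptions, not the paper's."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=n, replace=False)
    return X[idx], y[idx]
```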
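On the Software Dependencies row: since the paper names only "the dual simplex method", here is one concrete, open way to request a dual simplex for an LP, via SciPy's HiGHS backend. This is purely illustrative; the authors may have used a different solver entirely.

```python
from scipy.optimize import linprog

# Toy LP: minimize x + 2y subject to x + y >= 1 and 0 <= x, y <= 1.
# The constraint x + y >= 1 is rewritten as -x - y <= -1 for linprog.
res = linprog(c=[1, 2],
              A_ub=[[-1, -1]], b_ub=[-1],
              bounds=[(0, 1), (0, 1)],
              method="highs-ds")  # "highs-ds" selects the HiGHS dual simplex
print(res.x)  # optimal at x = 1, y = 0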
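Finally, the two similarity functions from the Experiment Setup row. The Gaussian kernel's exact parameterization is not given in the excerpt, so the exp(-||x_i - x_j||² / (2σ²)) form below is one common convention; the +1 shift on cosine similarity follows the paper's stated requirement that κ ≥ 0.

```python
import numpy as np

def kappa_cos_shifted(X):
    """1 + cosine similarity, which is >= 0 as Algorithm 1 requires."""
    Xn = X / np.clip(np.linalg.norm(X, axis=1, keepdims=True), 1e-12, None)
    return 1.0 + Xn @ Xn.T

def kappa_gauss(X, sigma=1.0):
    """Gaussian kernel exp(-||x_i - x_j||^2 / (2 sigma^2)); the paper sets
    sigma = 1 but does not spell out the normalization, which we assume here."""
    sq = (X ** 2).sum(axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))
```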