From Trees to Continuous Embeddings and Back: Hyperbolic Hierarchical Clustering

Authors: Ines Chami, Albert Gu, Vaggos Chatziafratis, Christopher Ré

NeurIPS 2020

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally evaluate HYPHC on a variety of HC benchmarks and find that even approximate solutions found with gradient descent yield superior clustering quality compared to agglomerative heuristics or other gradient-based algorithms. |
| Researcher Affiliation | Collaboration | Department of Computer Science, Stanford University; Institute for Computational and Mathematical Engineering, Stanford University; Google Research, NY. {chami, albertgu, vaggos, chrismre}@cs.stanford.edu |
| Pseudocode | Yes | Algorithm 1: Hyperbolic binary tree decoding dec(Z) (a simplified decoding sketch follows the table). |
| Open Source Code | Yes | We implemented HypHC in PyTorch and make our implementation publicly available: https://github.com/HazyResearch/HypHC |
| Open Datasets | Yes | We measure the clustering quality of HYPHC on six standard datasets from the UCI Machine Learning Repository, as well as CIFAR-100 [35]. |
| Dataset Splits | Yes | We consider four of the HC datasets that come with categorical labels for leaf nodes, split into training, testing and validation sets (30/60/10% splits; a split sketch follows the table). |
| Hardware Specification | Yes | We conducted our experiments on a single NVIDIA Tesla P100 GPU. |
| Software Dependencies | No | The paper mentions using PyTorch and geoopt [32] but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | We train HYPHC for 50 epochs (of the sampled triples) and optimize embeddings with Riemannian Adam [7]. We set the embedding dimension to two in all experiments... We perform a hyper-parameter search over learning rate values [1e-3, 5e-4, 1e-4] and temperature values [1e-1, 5e-2, 1e-2] (a training sketch follows the table). |
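
The decoding step named in the Pseudocode row maps leaf embeddings Z on the Poincaré disk back to a binary tree. Below is a minimal sketch in Python, assuming curvature -1 Poincaré embeddings: `poincare_dist` is the standard closed-form distance, but `greedy_decode`, its midpoint representatives, and its nearest-pair merge rule are simplified stand-ins for illustration, not the paper's exact Algorithm 1, which instead merges pairs by the depth of their hyperbolic lowest common ancestor (the point on their geodesic closest to the origin).

```python
import numpy as np

def poincare_dist(x, y):
    """Geodesic distance between x and y in the Poincare ball (curvature -1)."""
    sq = np.sum((x - y) ** 2)
    denom = (1.0 - np.sum(x ** 2)) * (1.0 - np.sum(y ** 2))
    return np.arccosh(1.0 + 2.0 * sq / denom)

def greedy_decode(Z):
    """Build a binary tree (nested tuples of leaf indices) by greedy merging.

    NOTE: simplified stand-in for the paper's dec(Z); HypHC merges by
    hyperbolic LCA depth rather than raw pairwise distance.
    """
    # Each active cluster: (subtree, representative embedding).
    clusters = [(i, Z[i]) for i in range(len(Z))]
    while len(clusters) > 1:
        # Find the closest pair of cluster representatives.
        best, pair = np.inf, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = poincare_dist(clusters[i][1], clusters[j][1])
                if d < best:
                    best, pair = d, (i, j)
        i, j = pair
        (ti, zi), (tj, zj) = clusters[i], clusters[j]
        # Crude merged representative: Euclidean midpoint (stays in the ball).
        merged = ((ti, tj), (zi + zj) / 2.0)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]

# Example: five random points near the origin of the 2D Poincare disk.
Z = np.random.randn(5, 2) * 0.1
print(greedy_decode(Z))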
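For the 30/60/10 train/test/validation split quoted in the Dataset Splits row, one way to realize it is two chained calls to scikit-learn's `train_test_split`, as sketched below. `X` and `y` are placeholder arrays, and the stratification and fixed seed are assumptions for illustration, not details taken from the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data; shapes and label counts are illustrative only.
X = np.random.randn(200, 8)
y = np.random.randint(0, 4, size=200)

# 30% to train, then 6/7 of the remaining 70% to test (60% overall)
# and the last 1/7 to validation (10% overall).
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, train_size=0.30, stratify=y, random_state=0)
X_test, X_val, y_test, y_val = train_test_split(
    X_rest, y_rest, train_size=6 / 7, stratify=y_rest, random_state=0)
```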
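The Experiment Setup row can be reproduced in spirit with geoopt, which the paper cites for Riemannian optimization. The sketch below wires up `RiemannianAdam` on a `PoincareBall` manifold with the quoted epoch count, embedding dimension, and hyperparameter grid; the objective is a labeled stand-in, since the paper's triplet relaxation of Dasgupta's cost is not reproduced here.

```python
import itertools
import torch
import geoopt

def train_run(n_points=100, dim=2, lr=1e-3, temperature=1e-1, epochs=50):
    ball = geoopt.PoincareBall()
    # Start near the origin; registering the embeddings as a
    # ManifoldParameter keeps Riemannian Adam updates on the ball.
    z = geoopt.ManifoldParameter(torch.randn(n_points, dim) * 1e-3,
                                 manifold=ball)
    optimizer = geoopt.optim.RiemannianAdam([z], lr=lr)
    idx_i, idx_j = torch.triu_indices(n_points, n_points, offset=1)
    for _ in range(epochs):
        optimizer.zero_grad()
        # Stand-in objective (NOT the paper's triplet loss): mean pairwise
        # hyperbolic distance, scaled by the temperature to mimic the role
        # of that hyperparameter.
        loss = (ball.dist(z[idx_i], z[idx_j]) / temperature).mean()
        loss.backward()
        optimizer.step()
    return z.detach()

# Grid search over the learning rates and temperatures quoted above.
for lr, tau in itertools.product([1e-3, 5e-4, 1e-4], [1e-1, 5e-2, 1e-2]):
    z = train_run(lr=lr, temperature=tau)
```

Using a `ManifoldParameter` rather than a plain tensor is what lets `RiemannianAdam` apply manifold-aware updates, so the embeddings never leave the Poincaré ball during training.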