reproducibilityindex.ai

Hierarchical Agglomerative Graph Clustering in Poly-Logarithmic Depth

Authors: Laxman Dhulipala, David Eisenstat, Jakub Lacki, Vahab Mirrokni, Jessica Shi

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We complement our theoretical results with a comprehensive study of the Par HAC algorithm in terms of its scalability, performance, and quality, and compare with several state-of-the-art sequential and parallel baselines.
Researcher Affiliation	Collaboration	Laxman Dhulipala University of Maryland and Google Research laxman@umd.edu David Eisenstat Google Research eisen@google.com Jakub Ł acki Google Research jlacki@google.com Vahab Mirrokni Google Research mirrokni@google.com Jessica Shi MIT CSAIL jeshi@mit.edu
Pseudocode	Yes	Algorithm 1 Par HAC-Contract Layer(G = (V, E, w), TL, ϵ, D)
Open Source Code	Yes	We have made our implementation publicly available on Git Hub.5 https://github.com/Par Alg/Par HAC
Open Datasets	Yes	We evaluate our algorithms on the iris, wine, digits, and cancer, and faces classiﬁcation datasets from the UCI dataset repository (found in the sklearn.datasets package).
Dataset Splits	No	The paper discusses the datasets used and parameters for graph construction (k-NN), but it does not specify explicit training, validation, or test dataset splits (e.g., percentages, sample counts, or references to predefined splits) needed for reproducibility.
Hardware Specification	Yes	We ran all of our experiments on a 72-core Dell Power Edge R930 (with two-way hyper-threading) with 4 2.4GHz Intel 18-core E7-8867 v4 Xeon processors (with a 4800MHz bus and 45MB L3 cache) and 1TB of main memory.
Software Dependencies	No	The paper mentions software frameworks and libraries like CPAM, Aspen, scipy, and sklearn.datasets, but it does not provide specific version numbers for these dependencies.
Experiment Setup	Yes	We run the Par HAC and Seq HAC algorithm using ϵ = 0.1. We compute the k-approximate nearest neighbors using a shared-memory parallel implementation of the Vamana approximate nearest neighbors (ANN) algorithm [45] with parameters R = 75, L = 100, Q = max(L, k).