Supervised Hierarchical Clustering with Exponential Linkage

Authors: Nishant Yadav, Ari Kobren, Nicholas Monath, Andrew McCallum

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In experiments on four datasets, our joint training procedure consistently matches or outperforms the next best training procedure/linkage function pair and gives up to 8 points improvement in dendrogram purity over discrepant pairs.
Researcher Affiliation | Academia | Nishant Yadav, Ari Kobren, Nicholas Monath, Andrew McCallum; College of Information and Computer Sciences, University of Massachusetts Amherst, USA. Correspondence to: Nishant Yadav <nishantyadav@cs.umass.edu>, Ari Kobren <akobren@cs.umass.edu>, Nicholas Monath <nmonath@cs.umass.edu>, Andrew McCallum <mccallum@cs.umass.edu>.
Pseudocode | Yes | Pseudocode appears in Algorithm 1 (a sketch of the exponential linkage function follows the table).
Open Source Code | Yes | Code for experiments is available at: https://github.com/iesl/expLinkage
Open Datasets | Yes | Experiments use the following four datasets: UMIST Face Data (Faces) (Graham & Allinson, 1998), 564 gray-scale images with 20 ground-truth clusters; Noun Phrase Coreference (NP Coref) (Hasler et al., 2006), 104 documents, each containing clusters of coreferent noun phrases (NPs); Rexa (Culotta et al., 2007), 1459 bibliographic records of authors divided into 8 blocks w.r.t. unique first initial and last name; and AMINER (Wang et al., 2011), 6730 publication records of authors divided into 100 blocks.
Dataset Splits | Yes | We randomly divide each dataset into 50 train/dev/test splits. For Faces, ...7 clusters are used for training, 6 for dev, and 7 for test. For NP Coref, ...we use 62 documents for training, 10 for dev, and 32 for test. For Rexa, ...we use 3 blocks for training, 2 for dev, and 3 for test. For AMINER, ...we use 60 blocks for training, 10 for dev, and 30 for test. (A split-generation sketch follows the table.)
Hardware Specification | No | The paper mentions 'high performance computing equipment' in the acknowledgments but does not provide specific details such as exact GPU/CPU models, processor types, or memory amounts used for running experiments.
Software Dependencies | No | The paper describes the algorithms and models used (e.g., the averaged perceptron) and mentions the availability of code, but it does not specify software dependencies with version numbers (e.g., Python, or library versions such as PyTorch, TensorFlow, or scikit-learn) needed for replication.
Experiment Setup | Yes | For Faces, ...we use threshold τ = 100 and margin µ = 10 when computing the loss. For other datasets, ...threshold τ = 0 and margin µ = 2. (A sketch of a threshold-and-margin loss follows the table.)
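
To make the Algorithm 1 row concrete, below is a minimal NumPy sketch of the paper's exponential linkage function, a softmax-weighted average of pairwise similarities between two clusters. The function name exp_linkage and the flat-array input are illustrative, not the released repo's API. As α → ∞ it approaches single linkage (max similarity), α = 0 gives average linkage, and α → -∞ approaches complete linkage (min similarity).

    import numpy as np

    def exp_linkage(sims, alpha):
        # sims: 1-D array of pairwise similarities f(x, y) over all
        # cross-cluster pairs (x in A, y in B); alpha: temperature.
        # Softmax-weighted average, with the usual max-subtraction
        # trick for numerical stability.
        z = alpha * sims
        w = np.exp(z - np.max(z))
        return float(np.sum(w * sims) / np.sum(w))

    # alpha = 0 recovers average linkage; large |alpha| approaches max/min:
    # exp_linkage(np.array([0.1, 0.5, 0.9]), 0.0)   -> 0.5
    # exp_linkage(np.array([0.1, 0.5, 0.9]), 50.0)  -> ~0.9
    # exp_linkage(np.array([0.1, 0.5, 0.9]), -50.0) -> ~0.1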
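
The split counts in the Dataset Splits row are over clusters, documents, or blocks depending on the dataset. The following is a hedged sketch of how the 50 random splits might be generated, assuming splitting happens at the block level; split_blocks and its signature are assumptions, not the repo's code.

    import random

    def split_blocks(blocks, n_train, n_dev, n_test, seed):
        # Shuffle the blocks with a per-split seed, then carve off
        # train/dev/test portions of the requested sizes.
        assert n_train + n_dev + n_test <= len(blocks)
        rng = random.Random(seed)
        shuffled = list(blocks)
        rng.shuffle(shuffled)
        return (shuffled[:n_train],
                shuffled[n_train:n_train + n_dev],
                shuffled[n_train + n_dev:n_train + n_dev + n_test])

    # e.g., Rexa's 8 blocks -> 3 train, 2 dev, 3 test; one split per seed.
    splits = [split_blocks(range(8), 3, 2, 3, seed=s) for s in range(50)]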
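
The τ and µ values in the Experiment Setup row parameterize the training loss. The paper's exact objective is not quoted above, so the sketch below is only a generic hinge-style reading of a threshold-with-margin loss: positive pairs are pushed to score above τ + µ and negative pairs below τ - µ. The function name and sign conventions are assumptions, not the paper's stated formulation.

    def margin_loss(score, should_link, tau, mu):
        # Hinge penalty around the threshold tau with margin mu:
        # positive pairs should score above tau + mu,
        # negative pairs should score below tau - mu.
        if should_link:
            return max(0.0, (tau + mu) - score)
        return max(0.0, score - (tau - mu))

    # Faces uses tau=100, mu=10; the other datasets use tau=0, mu=2.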