Supervised Hierarchical Clustering with Exponential Linkage

Authors: Nishant Yadav, Ari Kobren, Nicholas Monath, Andrew McCallum

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In experiments on four datasets, our joint training procedure consistently matches or outperforms the next best training procedure/linkage function pair and gives up to 8 points improvement in dendrogram purity over discrepant pairs.
Researcher Affiliation | Academia | Nishant Yadav, Ari Kobren, Nicholas Monath, Andrew McCallum; College of Information and Computer Sciences, University of Massachusetts Amherst, USA. Correspondence to: Nishant Yadav <nishantyadav@cs.umass.edu>, Ari Kobren <akobren@cs.umass.edu>, Nicholas Monath <nmonath@cs.umass.edu>, Andrew McCallum <mccallum@cs.umass.edu>.
Pseudocode | Yes | Pseudocode appears in Algorithm 1 (a sketch of the exponential linkage function follows the table).
Open Source Code | Yes | Code for experiments is available at: https://github.com/iesl/expLinkage
Open Datasets | Yes | Experiments use the following four datasets: UMIST Face Data (Faces) (Graham & Allinson, 1998), 564 gray-scale images with 20 ground-truth clusters; Noun Phrase Coreference (NP Coref) (Hasler et al., 2006), 104 documents, each containing clusters of coreferent noun phrases (NPs); Rexa (Culotta et al., 2007), 1459 bibliographic records of authors divided into 8 blocks w.r.t. unique first initial and last name; and AMINER (Wang et al., 2011), 6730 publication records of authors divided into 100 blocks.
Dataset Splits | Yes | We randomly divide each dataset into 50 train/dev/test splits. For Faces, ...7 clusters are used for training, 6 for dev, and 7 for test. For NP Coref, ...we use 62 documents for training, 10 for dev, and 32 for test. For Rexa, ...we use 3 blocks for training, 2 for dev, and 3 for test. For AMINER, ...we use 60 blocks for training, 10 for dev, and 30 for test. (A split-generation sketch follows the table.)
Hardware Specification | No | The paper mentions 'high performance computing equipment' in the acknowledgments but does not provide specific details such as exact GPU/CPU models, processor types, or memory amounts used for running experiments.
Software Dependencies | No | The paper describes the algorithms and models used (e.g., the averaged perceptron) and mentions the availability of code, but it does not specify software dependencies with version numbers (e.g., Python, or library versions such as PyTorch, TensorFlow, or scikit-learn) needed for replication.
Experiment Setup | Yes | For Faces, ...we use threshold τ = 100 and margin µ = 10 when computing the loss. For other datasets, ...threshold τ = 0 and margin µ = 2. (A sketch of a threshold-and-margin loss follows the table.)
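
To make the Algorithm 1 row concrete, below is a minimal NumPy sketch of the paper's exponential linkage function, a softmax-weighted average of pairwise similarities between two clusters. The function name exp_linkage and the flat-array input are illustrative, not the released repo's API. As α → ∞ it approaches single linkage (max similarity), α = 0 gives average linkage, and α → -∞ approaches complete linkage (min similarity).

    import numpy as np

    def exp_linkage(sims, alpha):
        # sims: 1-D array of pairwise similarities f(x, y) over all
        # cross-cluster pairs (x in A, y in B); alpha: temperature.
        # Softmax-weighted average, with the usual max-subtraction
        # trick for numerical stability.
        z = alpha * sims
        w = np.exp(z - np.max(z))
        return float(np.sum(w * sims) / np.sum(w))

    # alpha = 0 recovers average linkage; large |alpha| approaches max/min:
    # exp_linkage(np.array([0.1, 0.5, 0.9]), 0.0)   -> 0.5
    # exp_linkage(np.array([0.1, 0.5, 0.9]), 50.0)  -> ~0.9
    # exp_linkage(np.array([0.1, 0.5, 0.9]), -50.0) -> ~0.1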
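
The split counts in the Dataset Splits row are over clusters, documents, or blocks depending on the dataset. The following is a hedged sketch of how the 50 random splits might be generated, assuming splitting happens at the block level; split_blocks and its signature are assumptions, not the repo's code.

    import random

    def split_blocks(blocks, n_train, n_dev, n_test, seed):
        # Shuffle the blocks with a per-split seed, then carve off
        # train/dev/test portions of the requested sizes.
        assert n_train + n_dev + n_test <= len(blocks)
        rng = random.Random(seed)
        shuffled = list(blocks)
        rng.shuffle(shuffled)
        return (shuffled[:n_train],
                shuffled[n_train:n_train + n_dev],
                shuffled[n_train + n_dev:n_train + n_dev + n_test])

    # e.g., Rexa's 8 blocks -> 3 train, 2 dev, 3 test; one split per seed.
    splits = [split_blocks(range(8), 3, 2, 3, seed=s) for s in range(50)]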
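
The τ and µ values in the Experiment Setup row parameterize the training loss. The paper's exact objective is not quoted above, so the sketch below is only a generic hinge-style reading of a threshold-with-margin loss: positive pairs are pushed to score above τ + µ and negative pairs below τ - µ. The function name and sign conventions are assumptions, not the paper's stated formulation.

    def margin_loss(score, should_link, tau, mu):
        # Hinge penalty around the threshold tau with margin mu:
        # positive pairs should score above tau + mu,
        # negative pairs should score below tau - mu.
        if should_link:
            return max(0.0, (tau + mu) - score)
        return max(0.0, score - (tau - mu))

    # Faces uses tau=100, mu=10; the other datasets use tau=0, mu=2.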