Learning to Link
Authors: Maria-Florina Balcan, Travis Dick, Manuel Lang
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also carry out a comprehensive empirical evaluation of our techniques showing that they can lead to significantly improved clustering performance on real-world datasets. |
| Researcher Affiliation | Academia | Maria-Florina Balcan, Carnegie Mellon University, ninamf@cs.cmu.edu; Travis Dick, University of Pennsylvania, tbd@seas.upenn.edu; Manuel Lang, Karlsruhe Institute of Technology, manuel.lang@student.kit.edu |
| Pseudocode | Yes | Pseudocode for this method is given in Algorithm 1. In the pseudocode, clusters are represented by binary trees with leaves corresponding to the points belonging to that cluster. |
| Open Source Code | No | The paper does not include an unambiguous statement about releasing the source code for the described methodology or a direct link to a code repository. |
| Open Datasets | Yes | We learn the best algorithm and metric for clustering applications derived from MNIST, CIFAR-10, Omniglot, Places2, and a synthetic rings and disks distribution. |
| Dataset Splits | No | No explicit training/validation/test dataset splits were provided. The paper describes using 'N sample clustering tasks' from a distribution to compute average empirical loss, rather than splitting a fixed dataset for model training and validation. |
| Hardware Specification | No | No specific hardware details (GPU models, CPU models, memory specifications) used for running the experiments were provided. |
| Software Dependencies | No | No specific software dependencies with version numbers were mentioned. The paper describes conceptual algorithms and general types of metrics (e.g., neural network feature embeddings) but not the specific software or library versions used for implementation. |
| Experiment Setup | No | The paper describes the parameters of the algorithm family being learned (e.g., α for merge functions, β for metrics) and how clustering instances are generated, but it does not specify hyperparameters or system-level training settings for a machine learning model or a learning process itself (e.g., learning rate, batch size, optimizer settings). |
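
The Pseudocode and Experiment Setup rows mention clusters stored as binary trees, an α parameter for merge functions, and a β parameter for metrics. The sketch below illustrates how such a parameterized linkage procedure could look. It is not the paper's Algorithm 1: the linear interpolation between single and complete linkage, the convex combination of two base metrics `d0` and `d1`, and the names `leaves` and `interpolated_linkage` are all assumptions made for illustration.

```python
from itertools import combinations


def leaves(tree):
    """Return the point indices stored at the leaves of a binary cluster tree."""
    if isinstance(tree, int):  # a leaf is a single point index
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


def interpolated_linkage(points, d0, d1, alpha, beta):
    """Agglomerative clustering with assumed alpha/beta interpolation.

    alpha = 0 gives single linkage, alpha = 1 gives complete linkage.
    beta mixes two base metrics d0 and d1 (both hypothetical inputs).
    """

    def dist(i, j):
        # Assumed metric family: convex combination of two base metrics.
        return (1 - beta) * d0(points[i], points[j]) + beta * d1(points[i], points[j])

    def merge_cost(a, b):
        # Assumed merge family: interpolate min and max pairwise distance.
        pairwise = [dist(i, j) for i in leaves(a) for j in leaves(b)]
        return (1 - alpha) * min(pairwise) + alpha * max(pairwise)

    clusters = list(range(len(points)))  # start with one leaf per point
    while len(clusters) > 1:
        # Pick the pair of current clusters with the smallest merge cost.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = (clusters[i], clusters[j])  # new internal tree node
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0]
```

For example, calling `interpolated_linkage([0.0, 1.0, 5.0], d, d, alpha=0.0, beta=0.0)` with `d = lambda x, y: abs(x - y)` merges the two nearby points first and then attaches the third. This naive version recomputes every pairwise cost at each step, so it is meant for reading rather than for large inputs.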