Learning Representations for Hierarchies with Minimal Support
Authors: Benjamin Rozonoyer, Michael Boratko, Dhruvesh Patel, Wenlong Zhao, Shib Dasgupta, Hung Le, Andrew McCallum
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We achieve robust performance on synthetic hierarchies and a larger real-world taxonomy, observing improved convergence rates in a resource-constrained setting while reducing the set of training examples by as much as 99%. |
| Researcher Affiliation | Collaboration | Benjamin Rozonoyer¹, Michael Boratko², Dhruvesh Patel¹, Wenlong Zhao¹, Shib Dasgupta¹, Hung Le¹, Andrew McCallum¹. ¹University of Massachusetts Amherst, ²Google DeepMind. {brozonoyer,dhruveshpate,wenlongzhao,ssdasgupta,hungle,mccallum}@cs.umass.edu, mboratko@google.com |
| Pseudocode | Yes | Algorithm 1 FINDMINDISTINGUISHER |
| Open Source Code | Yes | Our code and data are available at https://github.com/iesl/geometric-graph-embedding. |
| Open Datasets | Yes | We evaluate hierarchy-aware sampling on the heterogeneous synthetic DAGs for which Boratko et al. [2021a] demonstrated superior performance using GT-BOX and random uniform negative sampling: balanced trees, where b is the branching factor; the nested Chinese restaurant process (nCRP) [Blei et al., 2010], where α is the normalized new table probability; and Price's model [Price, 1976], where m is the number of connections for a new node and c is a constant factor added to the probability of a vertex receiving an edge. We also test on the larger real-world Medical Subject Headings (MeSH) taxonomy [Lipscomb, 2000], 2020 release. |
| Dataset Splits | No | The paper describes training but provides no explicit train/validation/test splits in the main text; it refers to "metrics" and "convergence" but not to specific dataset divisions for validation. |
| Hardware Specification | Yes | The experiments were run in a single-node, single-GPU setup on a cluster with the following GPU architectures: NVIDIA Tesla M40 (24GB VRAM), NVIDIA GeForce GTX TITAN X (12GB VRAM), NVIDIA GeForce GTX 1080 Ti (11GB VRAM), NVIDIA RTX 2080 Ti (11GB VRAM), NVIDIA Quadro RTX 8000 (48GB VRAM) |
| Software Dependencies | No | The paper mentions using W&B [Biewald et al., 2020] for hyperparameter optimization but does not provide specific version numbers for any software dependencies such as programming languages or libraries. |
| Experiment Setup | Yes | For each of the final experiment runs, we fetch the best learning rate and λ_neg and run for a full 40 epochs for E_tc or 200 for E_tr, recording F1 at every epoch to produce a convergence plot. |
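
The "as much as 99%" reduction in the Research Type row comes from training on a hierarchy's transitive reduction (E_tr) rather than its full transitive closure (E_tc). A minimal sketch of how the two edge sets compare, using networkx; the library choice and the balanced-tree parameters are illustrative assumptions, not taken from the paper:

```python
import networkx as nx

# Directed balanced tree with branching factor b and height h
# (one of the synthetic generators named in the Open Datasets row;
# b and h here are arbitrary illustrative values).
b, h = 10, 3
dag = nx.balanced_tree(b, h, create_using=nx.DiGraph)

# E_tc: the transitive closure has an edge (u, v) for every
# ancestor/descendant pair -- the full supervision edge set.
e_tc = nx.transitive_closure(dag, reflexive=False)

# E_tr: the transitive reduction keeps only the minimal edge set
# whose closure recovers the whole hierarchy -- the "minimal support".
e_tr = nx.transitive_reduction(dag)

print(f"|E_tc| = {e_tc.number_of_edges()}")  # 3210 for b=10, h=3
print(f"|E_tr| = {e_tr.number_of_edges()}")  # 1110 (a tree is its own reduction)
print(f"reduction: {1 - e_tr.number_of_edges() / e_tc.number_of_edges():.1%}")
```

For this shallow tree the saving is only about 65%; the closure grows much faster than the reduction as hierarchies get deeper and denser, which is where savings on the order of 99% (e.g., on MeSH) become plausible.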
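The Open Datasets row also contrasts hierarchy-aware sampling with the random uniform negative sampling baseline of Boratko et al. [2021a]. A hedged sketch of that uniform baseline follows; the function name and rejection-sampling strategy are mine, not the paper's:

```python
import random
import networkx as nx

def uniform_negative_samples(dag: nx.DiGraph, k: int, seed: int = 0):
    """Sample k (u, v) node pairs uniformly at random that are NOT in the
    transitive closure, i.e. pairs with no ancestor/descendant relation."""
    rng = random.Random(seed)
    nodes = list(dag.nodes)
    closure = set(nx.transitive_closure(dag).edges)
    negatives = []
    while len(negatives) < k:
        u, v = rng.choice(nodes), rng.choice(nodes)
        if u != v and (u, v) not in closure:  # reject true positives
            negatives.append((u, v))
    return negatives

# Example: 5 negative pairs from a small directed balanced tree.
tree = nx.balanced_tree(3, 2, create_using=nx.DiGraph)
print(uniform_negative_samples(tree, 5))
```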