Robust large-margin learning in hyperbolic space
Authors: Melanie Weber, Manzil Zaheer, Ankit Singh Rawat, Aditya K. Menon, Sanjiv Kumar
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we present, to our knowledge, the first theoretical guarantees for learning a classifier in hyperbolic rather than Euclidean space. Specifically, we consider the problem of learning a large-margin classifier for data possessing a hierarchical structure. Our first contribution is a hyperbolic perceptron algorithm, which provably converges to a separating hyperplane. We then provide an algorithm to efficiently learn a large-margin hyperplane, relying on the careful injection of adversarial examples. Finally, we prove that for hierarchical data that embeds well into hyperbolic space, the low embedding dimension ensures superior guarantees when learning the classifier directly in hyperbolic space. We now present empirical studies for hyperbolic linear separator learning to corroborate our theory. |
| Researcher Affiliation | Collaboration | Melanie Weber (Princeton University, mw25@math.princeton.edu); Manzil Zaheer (Google Research, manzilzaheer@google.com); Ankit Singh Rawat (Google Research, ankitsrawat@google.com); Aditya Menon (Google Research, adityakmenon@google.com); Sanjiv Kumar (Google Research, sanjivk@google.com) |
| Pseudocode | Yes | Algorithm 1: Hyperbolic perceptron; Algorithm 2: Adversarial training (a hedged sketch of the perceptron appears below the table) |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code for the methodology or a link to a code repository. |
| Open Datasets | Yes | We use the ImageNet ILSVRC 2012 dataset [26] along with its label hierarchy from WordNet. |
| Dataset Splits | No | The paper describes the datasets used (ImageNet ILSVRC 2012, specific classes, and subtrees with example counts) but does not provide explicit train/validation/test splits (e.g., percentages, counts for each split, or methods like cross-validation) for reproducibility. |
| Hardware Specification | No | The paper does not explicitly describe the hardware (e.g., specific GPU/CPU models, memory amounts) used to run its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library names with their versions) needed to replicate the experiment. |
| Experiment Setup | Yes | We vary the budget α over {0, 0.25, 0.5, 0.75, 1.0}. In all experiments, we use a constant step-size ηt = 0.01 for all t. (A sketch of this sweep follows the table.) |
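
The Pseudocode row names Algorithm 1, a hyperbolic perceptron. As a rough illustration of what such an algorithm looks like, here is a minimal sketch in the Lorentz (hyperboloid) model, where linear decision rules take the form sign(⟨w, x⟩_M) for the Minkowski inner product ⟨·,·⟩_M. This is an assumption-laden sketch, not the paper's exact Algorithm 1; in particular, it omits any normalization or projection of w that the paper's version may perform.

```python
import numpy as np

def minkowski_dot(u, v):
    """Minkowski (Lorentzian) inner product: -u0*v0 + sum_i ui*vi."""
    return -u[0] * v[0] + np.dot(u[1:], v[1:])

def hyperbolic_perceptron(X, y, max_epochs=100):
    """Perceptron-style learning for points on the hyperboloid (a sketch).

    X: (n, d+1) array of points with <x, x>_M = -1 and x[0] > 0.
    y: (n,) array of labels in {-1, +1}.
    Returns a normal vector w for the decision rule sign(<w, x>_M).
    """
    w = np.zeros(X.shape[1])
    w[1] = 1.0  # arbitrary space-like initialization (a sketch-level choice)
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * minkowski_dot(w, xi) <= 0:  # point misclassified
                w = w + yi * xi  # Euclidean-perceptron-style correction
                mistakes += 1
        if mistakes == 0:  # all points separated; stop early
            break
    return w
```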
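
Similarly, the Experiment Setup row describes a sweep over the adversarial budget α with a constant step size. The sketch below, reusing `minkowski_dot` from the sketch above, pairs a placeholder adversarial-training loop with that sweep. Here `adversarial_train`, `project_to_hyperboloid`, and the synthetic data are all illustrative assumptions standing in for the paper's Algorithm 2, not a faithful reproduction of it.

```python
import numpy as np

def project_to_hyperboloid(x):
    """Retract onto the hyperboloid <x, x>_M = -1, x0 > 0, by recomputing x[0]."""
    x0 = np.sqrt(1.0 + np.dot(x[1:], x[1:]))
    return np.concatenate(([x0], x[1:]))

def adversarial_train(X, y, budget, eta=0.01, epochs=100):
    """Large-margin training via injected adversarial examples (a sketch).

    Each point is pushed (scaled by `budget`) in the direction that most
    reduces its margin y * <w, x>_M, retracted to the hyperboloid, and a
    perceptron-style step with step size `eta` is taken on violations.
    """
    w = np.zeros(X.shape[1])
    w[1] = 1.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            # Gradient of <w, x>_M w.r.t. x flips the sign of w's time coordinate.
            g = np.concatenate(([-w[0]], w[1:]))
            x_adv = project_to_hyperboloid(xi - budget * yi * g)
            if yi * minkowski_dot(w, x_adv) <= 0:
                w = w + eta * yi * x_adv
    return w

if __name__ == "__main__":
    # Toy data on the 2D hyperboloid, labeled by the first spatial coordinate.
    rng = np.random.default_rng(0)
    spatial = rng.normal(size=(20, 2))
    X = np.stack([project_to_hyperboloid(np.concatenate(([0.0], s)))
                  for s in spatial])
    y = np.where(spatial[:, 0] >= 0, 1.0, -1.0)
    for alpha in [0.0, 0.25, 0.5, 0.75, 1.0]:  # budget sweep from the paper
        w = adversarial_train(X, y, budget=alpha)
        acc = np.mean([np.sign(minkowski_dot(w, x)) == yi
                       for x, yi in zip(X, y)])
        print(f"alpha={alpha:.2f}  train accuracy={acc:.2f}")
```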