Robust large-margin learning in hyperbolic space

Authors: Melanie Weber, Manzil Zaheer, Ankit Singh Rawat, Aditya K. Menon, Sanjiv Kumar

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this paper, we present, to our knowledge, the first theoretical guarantees for learning a classifier in hyperbolic rather than Euclidean space. Specifically, we consider the problem of learning a large-margin classifier for data possessing a hierarchical structure. Our first contribution is a hyperbolic perceptron algorithm, which provably converges to a separating hyperplane. We then provide an algorithm to efficiently learn a large-margin hyperplane, relying on the careful injection of adversarial examples. Finally, we prove that for hierarchical data that embeds well into hyperbolic space, the low embedding dimension ensures superior guarantees when learning the classifier directly in hyperbolic space. We now present empirical studies for hyperbolic linear separator learning to corroborate our theory.
Researcher Affiliation | Collaboration | Melanie Weber (Princeton University, mw25@math.princeton.edu); Manzil Zaheer (Google Research, manzilzaheer@google.com); Ankit Singh Rawat (Google Research, ankitsrawat@google.com); Aditya Menon (Google Research, adityakmenon@google.com); Sanjiv Kumar (Google Research, sanjivk@google.com)
Pseudocode | Yes | Algorithm 1 (Hyperbolic perceptron) and Algorithm 2 (Adversarial training); hedged sketches of both ideas follow this table.
Open Source Code | No | The paper neither makes an explicit statement about releasing source code for the methodology nor links to a code repository.
Open Datasets | Yes | We use the ImageNet ILSVRC 2012 dataset [26] along with its label hierarchy from WordNet.
Dataset Splits | No | The paper describes the datasets used (ImageNet ILSVRC 2012, specific classes, and subtrees with example counts) but does not provide explicit train/validation/test splits (e.g., percentages, per-split counts, or methods such as cross-validation) needed for reproducibility.
Hardware Specification | No | The paper does not explicitly describe the hardware (e.g., specific GPU/CPU models, memory amounts) used to run its experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library names with their versions) needed to replicate the experiments.
Experiment Setup | Yes | We vary the budget α over {0, 0.25, 0.5, 0.75, 1.0}. In all experiments, we use a constant step-size η_t = 0.01 ∀t.
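
To make the pseudocode row above concrete, here is a minimal sketch of a mistake-driven perceptron in the Lorentz (hyperboloid) model, where data points satisfy ⟨x, x⟩_M = −1 and the classifier is sign(⟨w, x⟩_M). The update rule, initialization, and renormalization step are illustrative assumptions and may differ from the paper's Algorithm 1.

```python
import numpy as np

def minkowski_dot(u, v):
    """Minkowski inner product <u, v>_M = -u_0 v_0 + sum_i u_i v_i."""
    return -u[0] * v[0] + u[1:] @ v[1:]

def hyperbolic_perceptron(X, y, max_epochs=100):
    """Mistake-driven perceptron in the Lorentz (hyperboloid) model.

    X : (n, d+1) array of points on the hyperboloid, <x, x>_M = -1.
    y : (n,) array of labels in {-1, +1}.
    Returns a decision vector w; the classifier is sign(<w, x>_M).
    Hypothetical sketch: the paper's Algorithm 1 may use a different
    update and normalization.
    """
    w = np.zeros(X.shape[1])
    w[1] = 1.0  # arbitrary spacelike initialization, <w, w>_M = 1
    for _ in range(max_epochs):
        mistakes = 0
        for x, label in zip(X, y):
            if label * minkowski_dot(w, x) <= 0:  # misclassified point
                w = w + label * x                 # perceptron-style update
                norm2 = minkowski_dot(w, w)
                if norm2 > 0:                     # renormalize while spacelike
                    w = w / np.sqrt(norm2)
                mistakes += 1
        if mistakes == 0:                         # linearly separated
            break
    return w
```

The renormalization keeps w spacelike with ⟨w, w⟩_M = 1, the standard constraint for vectors defining geodesic decision hyperplanes in the Lorentz model.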
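
For the experiment setup row, a natural reading of the budget α is a cap on the hyperbolic distance between a point and its adversarial perturbation. In the Lorentz model that geodesic distance is d(x, y) = arccosh(−⟨x, y⟩_M); the sketch below computes it, with the lift map and sample points being illustrative assumptions rather than details from the paper.

```python
import numpy as np

def lorentz_distance(x, y):
    """Geodesic distance in the Lorentz model: d(x, y) = arccosh(-<x, y>_M)."""
    mdp = -x[0] * y[0] + x[1:] @ y[1:]   # Minkowski inner product
    return np.arccosh(max(-mdp, 1.0))    # clamp guards against round-off

def lift(u):
    """Lift a Euclidean point u in R^d onto the hyperboloid in R^(d+1)."""
    return np.concatenate(([np.sqrt(1.0 + u @ u)], u))

# Illustrative points; an adversarial x_adv would be constrained to
# d(x, x_adv) <= alpha for each budget alpha reported in the paper.
x, x_adv = lift(np.array([0.3, 0.1])), lift(np.array([0.5, 0.2]))
for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"alpha={alpha}: within budget = {lorentz_distance(x, x_adv) <= alpha}")
```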