Hyperbolic Busemann Learning with Ideal Prototypes

Authors: Mina Ghadimi Atigh, Martin Keller-Ressel, Pascal Mettes

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, we show that our approach provides a natural interpretation of classification confidence, while outperforming recent hyperspherical and hyperbolic prototype approaches. Experiments on three datasets show that our approach outperforms both hyperspherical and hyperbolic prototype approaches.
Researcher Affiliation | Academia | Mina Ghadimi Atigh (University of Amsterdam), Martin Keller-Ressel (Technische Universität Dresden), Pascal Mettes (University of Amsterdam)
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | The code is available at https://github.com/MinaGhadimiAtigh/Hyperbolic-Busemann-Learning.
Open Datasets | Yes | We evaluate the effect of the penalty term on CIFAR-10 and CIFAR-100 using a ResNet-32 backbone. Following [29], we report on CIFAR-100, CIFAR-10, and CUB Birds 200-2011 [45] across multiple dimensionalities. We have performed a comparative evaluation on ActivityNet [10] and Mini-Kinetics [48] for the search by action name task.
Dataset Splits | Yes | The trimmed dataset contains 23K videos and 200 classes, with a 15K split for training and 8K for validation. The Mini-Kinetics dataset consists of 83K trimmed videos and 200 classes, with a 78K split for training and 5K for validation.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions using 'Adam' as an optimizer and a 'ResNet-32 backbone' but does not provide specific version numbers for these software components or any other libraries/frameworks.
Experiment Setup | Yes | For the experiment, we use Adam with a learning rate of 5e-4, weight decay of 5e-5, batch size of 128, without pre-training. The network is trained for 1,110 epochs with learning rate decay of 10 after 1,000 and 1,100 epochs.
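
The Experiment Setup row above amounts to a standard optimizer-plus-schedule configuration. Below is a minimal sketch of that schedule, assuming a PyTorch training loop; the placeholder model, reading "learning rate decay of 10" as a factor of 0.1, and handling the batch size of 128 on the data-loader side are my assumptions, not code from the authors' repository.

```python
# Sketch of the quoted setup: Adam, lr 5e-4, weight decay 5e-5, batch size 128,
# 1,110 epochs, learning rate divided by 10 after epochs 1,000 and 1,100.
import torch
from torch.optim import Adam
from torch.optim.lr_scheduler import MultiStepLR

model = torch.nn.Linear(512, 2)   # placeholder network; the paper uses a ResNet-32 backbone
optimizer = Adam(model.parameters(), lr=5e-4, weight_decay=5e-5)
scheduler = MultiStepLR(optimizer, milestones=[1000, 1100], gamma=0.1)

for epoch in range(1110):
    # ... forward/backward passes over mini-batches of 128 samples would go here ...
    optimizer.step()              # placeholder for the per-batch parameter update
    scheduler.step()              # decays the learning rate at the two milestones
```

MultiStepLR is used here simply because it matches the quoted "decay after 1,000 and 1,100 epochs" description; the paper does not name a specific scheduler implementation.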