Hyperbolic Busemann Learning with Ideal Prototypes
Authors: Mina Ghadimi Atigh, Martin Keller-Ressel, Pascal Mettes
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we show that our approach provides a natural interpretation of classification confidence, while outperforming recent hyperspherical and hyperbolic prototype approaches. Experiments on three datasets show that our approach outperforms both hyperspherical and hyperbolic prototype approaches. |
| Researcher Affiliation | Academia | Mina Ghadimi Atigh University of Amsterdam Martin Keller-Ressel Technische Universität Dresden Pascal Mettes University of Amsterdam |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/MinaGhadimiAtigh/Hyperbolic-Busemann-Learning. |
| Open Datasets | Yes | We evaluate the effect of the penalty term on CIFAR-10 and CIFAR-100 using a ResNet-32 backbone. Following [29], we report on CIFAR-100, CIFAR-10, and CUB Birds 200-2011 [45] across multiple dimensionalities. We have performed a comparative evaluation on ActivityNet [10] and Mini-Kinetics [48] for the search by action name task. |
| Dataset Splits | Yes | The trimmed dataset contains 23K videos and 200 classes, with a 15K split for training and 8K for validation. The Mini-Kinetics dataset consists of 83K trimmed videos and 200 classes, with a 78K split for training and 5K for validation. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'Adam' as an optimizer and a 'ResNet-32 backbone' but does not provide specific version numbers for these software components or any other libraries/frameworks. |
| Experiment Setup | Yes | For the experiment, we use Adam with a learning rate of 5e-4, weight decay of 5e-5, batch size of 128, without pre-training. The network is trained for 1,110 epochs with learning rate decay of 10 after 1,000 and 1,100 epochs. |
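
As a reading aid, the sketch below illustrates the optimizer and learning-rate schedule quoted in the Experiment Setup row (Adam, learning rate 5e-4, weight decay 5e-5, batch size 128, no pre-training, 1,110 epochs, decay by a factor of 10 after epochs 1,000 and 1,100). The backbone, dataset, and loss shown here (a torchvision ResNet-18 stand-in, CIFAR-100, cross-entropy) are placeholder assumptions for illustration, not the authors' released implementation or their Busemann-based loss.

```python
# Minimal PyTorch sketch of the reported optimizer/schedule.
# Backbone, dataset, and loss are placeholders (assumptions), not the paper's code.
import torch
import torch.nn as nn
from torch.optim import Adam
from torch.optim.lr_scheduler import MultiStepLR
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from torchvision.models import resnet18  # stand-in; the paper reports a ResNet-32 backbone


def main():
    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Illustrative data pipeline: CIFAR-100 with batch size 128, as quoted above.
    transform = transforms.ToTensor()
    train_set = datasets.CIFAR100(root="data", train=True, download=True,
                                  transform=transform)
    train_loader = DataLoader(train_set, batch_size=128, shuffle=True,
                              num_workers=4)

    # No pre-training, as reported; 100 classes is a placeholder output size.
    model = resnet18(weights=None, num_classes=100).to(device)

    # Hyperparameters quoted in the table: Adam, lr 5e-4, weight decay 5e-5.
    optimizer = Adam(model.parameters(), lr=5e-4, weight_decay=5e-5)

    # Decay the learning rate by a factor of 10 after epochs 1,000 and 1,100.
    scheduler = MultiStepLR(optimizer, milestones=[1000, 1100], gamma=0.1)

    # Placeholder objective; the paper optimizes a Busemann-based prototype loss.
    criterion = nn.CrossEntropyLoss()

    for epoch in range(1110):
        model.train()
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        scheduler.step()


if __name__ == "__main__":
    main()
```

The milestone list `[1000, 1100]` with `gamma=0.1` is how the quoted "learning rate decay of 10 after 1,000 and 1,100 epochs" would typically be expressed with `MultiStepLR`; any other detail of the loop is illustrative only.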