Fully Hyperbolic Convolutional Neural Networks for Computer Vision

Authors: Ahmad Bdeir, Kristian Schwethelm, Niels Landwehr

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate hyperbolic models on image classification and generation tasks and compare them against Euclidean and hybrid HNN counterparts from the literature. To ensure a fair comparison, in every task, we directly translate a Euclidean baseline to the hyperbolic setting by using hyperbolic modules as one-to-one replacements. All experiments are implemented in PyTorch (Paszke et al., 2019), and we optimize hyperbolic models using adaptive Riemannian optimizers (Bécigneul & Ganea, 2018) provided by Geoopt (Kochurov et al., 2020), with floating-point precision set to 32 bits. We provide detailed experimental configurations in Appendix C and ablation experiments in Appendix D. (A minimal Riemannian-optimization sketch follows the table.)
Researcher Affiliation | Academia | Ahmad Bdeir (1), Kristian Schwethelm (2), & Niels Landwehr (1); affiliations: 1 Data Science Department, University of Hildesheim; 2 Chair for Artificial Intelligence in Medicine, Technical University of Munich
Pseudocode | No | The paper describes methods mathematically and verbally but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is publicly available at https://github.com/kschwethelm/HyperbolicCV.
Open Datasets | Yes | We evaluate image classification performance using ResNet-18 (He et al., 2015b) and three datasets: CIFAR-10 (Krizhevsky, 2009), CIFAR-100 (Krizhevsky, 2009), and Tiny-ImageNet (Le & Yang, 2015). ... Building on the experimental setting of Ghosh et al. (2019), we test vanilla VAEs and assess generative performance on CIFAR-10 (Krizhevsky, 2009), CIFAR-100 (Krizhevsky, 2009), and CelebA (Liu et al., 2015) datasets. ... For this, the models are retrained on the MNIST (Lecun et al., 1998) dataset with an embedding dimension d_E = 2.
Dataset Splits | Yes | The CIFAR-10 and CIFAR-100 datasets each contain 60,000 32×32 colored images from 10 and 100 different classes, respectively. We use the dataset split implemented in PyTorch, which includes 50,000 training images and 10,000 testing images. ... Tiny-ImageNet ... Here, we use the official validation split for testing our models. ... Here, we use the PyTorch implementation, containing 162,770 training images, 19,867 validation images, and 19,962 testing images. ... The reconstruction FID ... is calculated by comparing test images with reconstructed validation images. As the CIFAR datasets have no official validation set, we exclude a fixed random portion of 10,000 images from the training set. (A split sketch follows the table.)
Hardware Specification | Yes | The results show that hybrid HNNs only add little overhead compared to the significantly slower HCNN. This makes scaling HCNNs challenging and requires special attention in future works. However, we also see that hyperbolic models gain much more performance from the automatic compilation than the Euclidean model. This indicates greater room for improvement in terms of implementation optimizations. ... GPU type: RTX A5000 (A compilation sketch follows the table.)
Software Dependencies | No | All experiments are implemented in PyTorch (Paszke et al., 2019), and we optimize hyperbolic models using adaptive Riemannian optimizers (Bécigneul & Ganea, 2018) provided by Geoopt (Kochurov et al., 2020) ... We evaluate the VAEs by employing two versions of the FID (Heusel et al., 2017) implemented by Seitzer (2020). While software names like PyTorch, Geoopt, and Seitzer's FID implementation are mentioned, specific version numbers for these dependencies are not provided. (An FID-usage sketch follows the table.)
Experiment Setup | Yes | Table 4 (hyperparameters used in training classification models): Epochs: 200; Batch size: 128; Learning rate (LR): 1e-1; Drop LR epochs: 60, 120, 160; Drop LR gamma: 0.2; Weight decay: 5e-4; Optimizer: (Riemannian) SGD; Floating-point precision: 32-bit; GPU type: RTX A5000; Num. GPUs: 1 or 2; Hyperbolic curvature K: 1. Table 6 (hyperparameters used in training image generation models, MNIST / CIFAR-10/100 / CelebA): Epochs: 100 / 100 / 70; Batch size: 100 for all; Learning rate: 5e-4 for all; Weight decay: 0 for all; KL loss weight: 0.312 / 0.024 / 0.09; Optimizer: (Riemannian) Adam for all; Floating-point precision: 32-bit for all; GPU type: RTX A5000 for all; Num. GPUs: 1 / 1 / 2. (A training-schedule sketch follows the table.)
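
For the Riemannian optimization setup quoted under Research Type, here is a minimal, illustrative sketch (not the authors' code) of how Geoopt's adaptive Riemannian optimizers keep parameters on a hyperbolic manifold; the toy objective, dimensions, and learning rate are assumptions.

```python
import torch
import geoopt

# Illustrative only: a learnable point constrained to the Lorentz manifold,
# optimized with one of Geoopt's adaptive Riemannian optimizers (float32 by default).
manifold = geoopt.Lorentz(k=1.0)

# projx maps arbitrary vectors onto the hyperboloid, yielding valid manifold points.
point = geoopt.ManifoldParameter(manifold.projx(torch.randn(8)), manifold=manifold)
target = manifold.projx(torch.randn(8))

optimizer = geoopt.optim.RiemannianAdam([point], lr=1e-2)

for _ in range(200):
    optimizer.zero_grad()
    loss = manifold.dist(point, target) ** 2  # squared geodesic distance
    loss.backward()
    optimizer.step()                          # the update keeps `point` on the manifold
```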
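For the CIFAR validation holdout described under Dataset Splits, a short sketch assuming torchvision's standard CIFAR-10 train/test split; the seed and data directory are placeholders, not values taken from the paper.

```python
import torch
from torchvision import datasets, transforms

transform = transforms.ToTensor()

# Standard torchvision split: 50,000 training and 10,000 test images.
train_full = datasets.CIFAR10("data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR10("data", train=False, download=True, transform=transform)

# Carve out a fixed random 10,000-image validation set (seed is an assumption).
generator = torch.Generator().manual_seed(0)
train_set, val_set = torch.utils.data.random_split(
    train_full, [40000, 10000], generator=generator
)

print(len(train_set), len(val_set), len(test_set))  # 40000 10000 10000
```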
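The Hardware Specification row mentions gains from automatic compilation; assuming this refers to torch.compile in PyTorch 2.x, a minimal usage sketch follows (the toy network is a placeholder, not one of the paper's models).

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Placeholder network; the paper's models are (hyperbolic) ResNets and VAEs.
model = torch.nn.Sequential(torch.nn.Linear(512, 512), torch.nn.ReLU()).to(device)
compiled_model = torch.compile(model)  # PyTorch 2.x graph compilation

x = torch.randn(128, 512, device=device)
with torch.no_grad():
    y = compiled_model(x)  # first call compiles; later calls run the optimized graph
```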
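For the FID evaluation mentioned under Software Dependencies, a sketch assuming Seitzer (2020) refers to the pytorch-fid package; the image directories below are placeholders.

```python
import torch
from pytorch_fid.fid_score import calculate_fid_given_paths

device = "cuda" if torch.cuda.is_available() else "cpu"

# Compare two folders of images (e.g., test images vs. reconstructions).
fid = calculate_fid_given_paths(
    ["real_images/", "generated_images/"],  # placeholder paths
    batch_size=50,
    device=device,
    dims=2048,  # default InceptionV3 pool3 feature dimension
)
print(f"FID: {fid:.2f}")
```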
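Finally, a sketch of the classification schedule from Table 4 (Experiment Setup row): Riemannian SGD at LR 1e-1 with weight decay 5e-4, dropped by a factor of 0.2 at epochs 60, 120, and 160 over 200 epochs. The placeholder model and the omitted data loop are assumptions.

```python
import torch
import geoopt

# Placeholder module; in the paper this would be a hyperbolic ResNet-18.
model = torch.nn.Linear(512, 100)

# Riemannian SGD falls back to standard SGD for ordinary (Euclidean) tensors.
optimizer = geoopt.optim.RiemannianSGD(model.parameters(), lr=1e-1, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[60, 120, 160], gamma=0.2  # "Drop LR" epochs and gamma
)

for epoch in range(200):
    # ... train one epoch with batch size 128 ...
    scheduler.step()
```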