FisheyeHDK: Hyperbolic Deformable Kernel Learning for Ultra-Wide Field-of-View Image Recognition

Authors: Ola Ahmad, Freddy Lecue (pp. 5968-5975)

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Using synthetic distortion profiles, we demonstrate the effectiveness of our approach. We select two datasets, Cityscapes and BDD100K, of perspective images which we transform to fisheye equivalents at different scaling factors (analogous to focal lengths). Finally, we provide an experiment on data collected by a real fisheye camera. Validations and experiments show that our approach improves existing deformable kernel methods for CNN adaptation on fisheye images.
Researcher Affiliation | Collaboration | Ola Ahmad (1), Freddy Lecue (1, 2); (1) Thales, Thales Digital Solutions, Canada; (2) INRIA, France; ola.ahmad@thalesdigital.io, freddy.lecue@inria.fr
Pseudocode | No | The paper describes the proposed approach through text and diagrams, but it does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper mentions using existing open-source libraries (PyTorch Geometric, geoopt) but does not provide a link to, or an explicit statement about releasing, the source code for FisheyeHDK itself.
Open Datasets | Yes | We select two datasets, Cityscapes and BDD100K, of perspective images which we transform to fisheye equivalents at different scaling factors (analogous to focal lengths). [...] Cityscapes: https://www.cityscapes-dataset.com/ [...] BDD100K: https://bdd-data.berkeley.edu/
Dataset Splits | Yes | Cityscapes: The dataset comprises 5000 images divided into train, validation and test sets (2975, 500 and 1525 images, respectively). [...] The validation set was used as the test set and the original training set was split in two (0.9/0.1 ratio) for training and validation purposes. BDD100K: Images and annotations are divided into train, validation and test sets (7000, 1000, 2000, respectively). [...] Similar to Cityscapes, we use the validation set as test set and split the original train set into two sets: 6500 for training and 500 for validation. Real data: We randomly split the dataset into three sets: 680 for training, 20 for validation and 100 images for testing.
Hardware Specification | Yes | All networks are trained on 2 Nvidia Tesla P100 GPUs, each with 16 GB memory.
Software Dependencies | No | The paper mentions using "CUDA implementations" and the open-source libraries PyTorch Geometric and geoopt, but does not specify their version numbers.
Experiment Setup | Yes | We trained the model on 2 GPUs using synchronized batch norm. We implemented the pixelwise weighted cross-entropy as a loss function. For baselines, we used the Stochastic Gradient Descent (SGD) optimizer with momentum 0.9 and weight decay 5×10⁻⁴. The learning rate was initialized to 1×10⁻³ for the encoder and 1×10⁻² for the decoder; both are updated using the poly learning rate policy. Our approach comprises both Euclidean and hyperbolic parameters. For the hyperbolic parameters, we adopted Riemannian SGD (RSGD) (Bonnabel 2013) with a learning rate of 1×10⁻², since they are manifold parameters, and used the usual SGD for the Euclidean parameters. [...] For the synthetic fisheye dataset, we set the training batch size to 16 and the validation batch size to 4. For real fisheye data, we set the batch size to 8 during training and validation. [...] In all experiments, we trained the models for 100 epochs.
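The synthetic-fisheye protocol quoted under Research Type (perspective images warped to fisheye equivalents at different scaling factors analogous to focal lengths) can be sketched as follows. The paper's exact distortion profile is not reproduced in this report, so the equidistant model (fisheye radius r = f·θ, perspective radius r = f·tan θ), the function name, and all parameter values below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def perspective_to_fisheye(img, f):
    """Warp a perspective image into a fisheye-like equivalent.

    Hypothetical sketch assuming the equidistant fisheye model:
    for each output pixel at radius r_fish from the image center,
    theta = r_fish / f, and the matching perspective (source) radius
    is r_persp = f * tan(theta). Smaller f means stronger distortion.
    """
    h, w = img.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    dx, dy = xs - cx, ys - cy
    r_fish = np.hypot(dx, dy)
    # Equidistant mapping: r_fish = f * theta (clip to stay below 90 deg).
    theta = np.clip(r_fish / f, 0.0, np.pi / 2 - 1e-3)
    # Per-pixel scale from fisheye radius back to perspective radius.
    scale = np.where(r_fish > 0,
                     f * np.tan(theta) / np.maximum(r_fish, 1e-8),
                     1.0)
    # Nearest-neighbor sampling from the perspective source image.
    src_x = np.clip(np.round(cx + dx * scale).astype(int), 0, w - 1)
    src_y = np.clip(np.round(cy + dy * scale).astype(int), 0, h - 1)
    return img[src_y, src_x]
```

Sweeping `f` over several values would produce the "different scaling factors" the quote describes; a production version would use bilinear interpolation rather than nearest-neighbor sampling.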
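The 0.9/0.1 train/validation split quoted for Cityscapes under Dataset Splits is simple to reproduce; this sketch (function name and fixed seed are our own choices, not the paper's) shows one way:

```python
import random

def split_train_val(items, val_ratio=0.1, seed=0):
    """Shuffle and split a list of samples into train/validation sets.

    With val_ratio=0.1 this mirrors the 0.9/0.1 ratio the paper
    reports for Cityscapes (2975 original training images).
    """
    items = list(items)
    random.Random(seed).shuffle(items)  # deterministic given the seed
    n_val = int(len(items) * val_ratio)
    return items[n_val:], items[:n_val]
```

Applied to the 2975 Cityscapes training images, this yields 2678 training and 297 validation samples; the paper's exact counts may differ by rounding.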
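The "poly learning rate policy" named in the Experiment Setup row is conventionally base_lr · (1 − iter/max_iter)^power. The exponent is not given in the quoted text, so `power=0.9` (the common semantic-segmentation default) is an assumption:

```python
def poly_lr(base_lr, cur_iter, max_iter, power=0.9):
    """Poly learning-rate schedule: decays from base_lr to 0.

    base_lr * (1 - cur_iter / max_iter) ** power
    The paper initializes base_lr to 1e-3 (encoder) and 1e-2 (decoder);
    power=0.9 is an assumed value not stated in the quoted setup.
    """
    return base_lr * (1.0 - cur_iter / max_iter) ** power
```

In practice this would be evaluated every iteration (or epoch) and written into each optimizer parameter group, with the encoder and decoder groups carrying their own `base_lr`.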