LF-Net: Learning Local Features from Images
Authors: Yuki Ono, Eduard Trulls, Pascal Fua, Kwang Moo Yi
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our models outperform the state of the art on sparse feature matching on both datasets, while running at 60+ fps for QVGA images. |
| Researcher Affiliation | Collaboration | Yuki Ono Sony Imaging Products & Solutions Inc. yuki.ono@sony.com Eduard Trulls École Polytechnique Fédérale de Lausanne eduard.trulls@epfl.ch Pascal Fua École Polytechnique Fédérale de Lausanne pascal.fua@epfl.ch Kwang Moo Yi Visual Computing Group, University of Victoria kyi@uvic.ca |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our implementation is written in TensorFlow and is publicly available: https://github.com/vcg-uvic/lf-net-release |
| Open Datasets | Yes | For indoors data we rely on ScanNet [10], an RGB-D dataset... For outdoors data we use 25 photo-tourism image collections of popular landmarks collected by [16, 39]. |
| Dataset Splits | Yes | The dataset provides training, validation, and test splits that we use accordingly. [...] We use 14 sequences for training and validation, splitting the images into training and validation subsets with a 70:30 ratio, and sample up to 50k pairs from each different scene. |
| Hardware Specification | Yes | Even so, our implementation can extract 512 keypoints from QVGA frames (320×240) at 62 fps and from VGA frames (640×480) at 25 fps (42 and 20 respectively for 1024 keypoints), on a Titan X PASCAL. |
| Software Dependencies | No | The paper states 'Our implementation is written in TensorFlow' but does not specify a version number for TensorFlow or any other software dependencies. |
| Experiment Setup | Yes | For optimization, we use ADAM [20] with a learning rate of 10⁻³. To balance the loss function for the detector network we use λ_pair = 0.01, and λ_ori = λ_scale = 0.1. [...] While training we extract 512 keypoints, as larger numbers become problematic due to memory constraints. This also allows us to maintain a batch with multiple image pairs (6), which helps convergence. (A minimal configuration sketch follows the table.) |
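
The Experiment Setup row quotes concrete hyper-parameter values; the snippet below is a minimal sketch of how they could be wired up. It assumes the TensorFlow 2.x Keras API (the paper does not pin a version), and the loss-term names (`im_loss`, `pair_loss`, `ori_loss`, `scale_loss`) are hypothetical placeholders. Only the numeric values come from the paper; this is not the authors' implementation, which is available at the repository linked above.

```python
# Minimal sketch of the reported training configuration, assuming the
# TensorFlow 2.x Keras API (the paper does not specify a version).
# Only the numeric hyper-parameter values are taken from the paper; the
# loss-term names below are hypothetical placeholders, not the authors' code.
import tensorflow as tf

LEARNING_RATE = 1e-3   # ADAM learning rate (paper)
LAMBDA_PAIR   = 0.01   # λ_pair, detector loss balance (paper)
LAMBDA_ORI    = 0.1    # λ_ori (paper)
LAMBDA_SCALE  = 0.1    # λ_scale (paper)
NUM_KEYPOINTS = 512    # keypoints extracted per image during training (paper)
BATCH_PAIRS   = 6      # image pairs per training batch (paper)

optimizer = tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE)

def detector_total_loss(im_loss, pair_loss, ori_loss, scale_loss):
    """Weighted sum of detector loss terms using the coefficients quoted above.

    How each term is computed is defined in the paper and the public code
    release; here they are simply scalar tensors passed in.
    """
    return (im_loss
            + LAMBDA_PAIR * pair_loss
            + LAMBDA_ORI * ori_loss
            + LAMBDA_SCALE * scale_loss)
```

With a batch of `BATCH_PAIRS` image pairs, a training step would compute these terms from the network outputs and apply gradients with `optimizer.apply_gradients(...)` in the usual way; the structure of the loss terms themselves is described in the paper.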