Weatherproofing Retrieval for Localization with Generative AI and Geometric Consistency

Authors: Yannis Kalantidis, Mert Bülent Sarıyıldız, Rafael S. Rezende, Philippe Weinzaepfel, Diane Larlus, Gabriela Csurka

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that it leads to large improvements on multiple challenging visual localization and place recognition benchmarks.
Researcher Affiliation | Industry | Yannis Kalantidis, Mert Bülent Sarıyıldız, Rafael S. Rezende, Philippe Weinzaepfel, Diane Larlus, Gabriela Csurka (NAVER LABS Europe)
Pseudocode | No | The paper describes its methods textually and with mathematical equations but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | Project page: https://europe.naverlabs.com/ret4loc
Open Datasets | Yes | Following HOW (Tolias et al., 2020), we use the SfM-120K dataset (Radenović et al., 2019) to train all Ret4Loc models.
Dataset Splits | Yes | Extended CMU Seasons (Badino et al., 2011; Toft et al., 2022) ... These 24 sequences are split into a validation set, composed of 10 sequences for which the query poses are made available, and a test set, composed of 14 sequences for which no query poses are available...
Hardware Specification | Yes | Our model trains in less than 8 hours on a single A100 GPU (more details in Appendix B.6).
Software Dependencies | No | The paper mentions using 'torchvision (Paszke et al., 2019)' and 'AugMix (Hendrycks et al., 2020)' and building on the 'HOW codebase', but it does not specify explicit version numbers for these software dependencies (e.g., 'PyTorch 1.9' or 'torchvision 0.x').
Experiment Setup | Yes | We train ResNet50 models... We use randomly resized crops of size 768 × 768... we apply AugMix... with its default hyper-parameters suggested by the authors: severity of augmentation operators: 3, width of augmentation chain: 3, depth of augmentation chain: -1, probability coefficient for Beta and Dirichlet distributions: 1... We perform episodic training on SfM-120K... in each episode, we sample 2000 random query-positive pairs and a pool of 20000 images... We train each model for 50 such episodes... We used M = 5 negatives in the tuple... learning rate 1e-5 and weight decay 3e-2 work the best in many cases.
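
To make the quoted experiment setup easier to picture, the following is a minimal configuration sketch, assuming PyTorch and torchvision (which the paper cites for AugMix). The optimizer choice (AdamW), the transform ordering, and all variable names are illustrative assumptions, not details taken from the Ret4Loc code release; the AugMix hyper-parameters shown are simply torchvision's defaults, which match the values quoted above.

```python
# Illustrative sketch of the quoted experiment setup. Not the authors' code:
# the transform ordering and optimizer type are assumptions for illustration.
import torch
import torchvision.transforms as T
import torchvision.models as models

# Augmentation pipeline: 768x768 random resized crops followed by AugMix with the
# default hyper-parameters cited in the table (severity 3, width 3, depth -1, alpha 1).
train_transform = T.Compose([
    T.RandomResizedCrop(768),
    T.AugMix(severity=3, mixture_width=3, chain_depth=-1, alpha=1.0),
    T.ToTensor(),
])

# Backbone: a ResNet50, as stated in the experiment setup.
backbone = models.resnet50(weights=None)

# Optimizer with the best-performing values reported (lr 1e-5, weight decay 3e-2).
# The optimizer type itself is an assumption; AdamW is used here only as an example
# of decoupled weight decay at that magnitude.
optimizer = torch.optim.AdamW(backbone.parameters(), lr=1e-5, weight_decay=3e-2)

# Episodic training constants from the quoted setup: 50 episodes, each sampling
# 2000 query-positive pairs from a pool of 20000 images, with M = 5 negatives per
# tuple. Negative mining and the loss are omitted; see the HOW codebase for those.
NUM_EPISODES, PAIRS_PER_EPISODE, POOL_SIZE, NUM_NEGATIVES = 50, 2000, 20000, 5
```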