LoCo: Learning 3D Location-Consistent Image Features with a Memory-Efficient Ranking Loss

Authors: Dominik Kloepfer, João F. Henriques, Dylan Campbell

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We showcase the improved location consistency of our trained feature extractor directly on a multi-view consistency task, as well as the downstream task of scene-stable panoptic segmentation, significantly outperforming previous state-of-the-art.
Researcher Affiliation | Academia | Dominik A. Kloepfer, Visual Geometry Group, University of Oxford (dominik@robots.ox.ac.uk); João Henriques, Visual Geometry Group, University of Oxford (joao@robots.ox.ac.uk); Dylan Campbell, School of Computing, Australian National University (dylan.campbell@anu.edu.au)
Pseudocode | No | The paper describes methods and architectures but does not include explicit pseudocode or algorithm blocks.
Open Source Code | No | We provide further implementation details regarding the efficient sampling of patch pairs in Appendix D, and will publicly release our training code.
Open Datasets | Yes | The training dataset we use comprises 59 environments of the Matterport3D dataset, resizing the images to 256 × 320 pixels. The Matterport3D dataset is particularly suitable for our task of enforcing multi-view consistency due to its diversity and the way it captures varied viewpoints of the same scene through panorama cropping. ... Like El Banani et al., we evaluate on the paired ScanNet [10] split proposed by Sarlin et al. [36].
Dataset Splits | No | The paper discusses training and testing datasets, but does not explicitly provide details about a validation dataset split.
Hardware Specification | Yes | In contrast, CroCo trains the entire network (85 million parameters) with a multi-view loss and on significantly larger datasets with greater computational resources (8 A100 GPUs vs. 1 RTX8000 GPU).
Software Dependencies | No | The paper mentions the FAISS library but does not provide specific version numbers for software dependencies like Python, PyTorch, or CUDA.
Experiment Setup | Yes | We use values of ρ = 0.5m for the positive radius, κ = 5.0m for the negative radius, τ = 0.01 for the sigmoid temperature, and 0.076 for the saturation threshold. ... Instead, we adapt the architecture used by DINO-Tracker [41], keeping pre-trained DINO [5] features frozen and training a convolutional neural network to learn additive residuals to those features. Table 3: Hyperparameters of the convolutional layers of the residual network used for the pixel-correspondence task in Section 4.3.
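
To make the reported setup concrete, below is a minimal PyTorch sketch of the quoted hyperparameters and the residual architecture (frozen pre-trained DINO features plus a trainable convolutional residual head). The class name, layer sizes, feature dimension, and the assumption that the backbone returns dense feature maps are illustrative placeholders; the paper's actual per-layer values are given in its Table 3, which is not reproduced here.

```python
import torch
import torch.nn as nn

# Hyperparameters quoted in the paper's experiment setup.
LOCO_HPARAMS = {
    "positive_radius_m": 0.5,       # rho: 3D radius within which patches count as positives
    "negative_radius_m": 5.0,       # kappa: 3D radius beyond which patches count as negatives
    "sigmoid_temperature": 0.01,    # tau
    "saturation_threshold": 0.076,  # loss saturation threshold
}


class ResidualFeatureExtractor(nn.Module):
    """Illustrative sketch of the pixel-correspondence setup described in the paper:
    pre-trained DINO features are kept frozen and a small CNN learns additive residuals.
    Layer sizes here are placeholders, not the values from the paper's Table 3."""

    def __init__(self, dino_backbone: nn.Module, feat_dim: int = 384):
        super().__init__()
        self.dino = dino_backbone
        for p in self.dino.parameters():  # keep the pre-trained backbone frozen
            p.requires_grad = False
        # Placeholder residual head; only this part is trained.
        self.residual = nn.Sequential(
            nn.Conv2d(feat_dim, feat_dim, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(feat_dim, feat_dim, kernel_size=3, padding=1),
        )

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        # Assumes the backbone yields dense feature maps of shape (B, C, H', W').
        with torch.no_grad():
            feats = self.dino(images)
        return feats + self.residual(feats)  # additive residuals on frozen features
```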