Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

Rig3R: Rig-Aware Conditioning and Discovery for 3D Reconstruction

Authors: Samuel Li, Pujith Kachana, Prajwal Chidananda, Saurabh Nair, Yasutaka Furukawa, Matthew A Brown

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments across diverse real-world driving datasets show that Rig3R achieves state-of-the-art performance in 3D reconstruction, camera pose estimation, and rig discovery, outperforming both traditional and learned methods, all in a single forward pass. ... Section 4 Experiments: Evaluation Data. ... Table 1: Multi-view pose estimation results... Table 2: Multi-view pointmap estimation results.
Researcher Affiliation | Collaboration | Samuel Li (1,2), Pujith Kachana (1,2), Prajwal Chidananda (1), Saurabh Nair (1), Yasutaka Furukawa (1), Matthew Brown (1). (1) Wayve Technologies, (2) Carnegie Mellon University
Pseudocode | No | The paper describes the model architecture and methods in text and with diagrams (e.g., Figure 2), but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | While we do not provide open access to code at submission time, the method is clearly described and most datasets are publicly available for reproducibility; we will look into releasing code and a model trained only on public datasets in the future.
Open Datasets | Yes | We train Rig3R on a diverse data mix: CO3D-v2 [60], BlendedMVS [61], Mapfree [62], ScanNet++ v2 [63], MVImgNet [64], PointOdyssey [65], Virtual KITTI2 [66], TartanAir V2 [67], PandaSet [68], KITTI [69], Argoverse2 [70], nuScenes [71], Waymo [72], and an internal dataset.
Dataset Splits | Yes | We evaluate Rig3R on the Waymo Open [72] validation set and WayveScenes101 [74]... For each scene, we extract two 24-frame samples, each using the full 5-camera rig spaced approximately 2 seconds apart. ... Rig3R is trained on 24-frame samples with a batch size of 128...
Hardware Specification | Yes | Rig3R is trained on 24-frame samples with a batch size of 128, using 128 H100 GPUs for 250k steps over 5 days.
Software Dependencies | No | The paper mentions specific model components like 'ViT-Large encoder' and 'DPT module' but does not specify software dependencies like programming language versions or library versions (e.g., Python, PyTorch, CUDA versions).
Experiment Setup | Yes | Rig3R is trained on 24-frame samples with a batch size of 128, using 128 H100 GPUs for 250k steps over 5 days. Images are resized to 512 × 512 with padding. We apply data augmentations including random per-frame color jitter, Gaussian blur, and centered aspect-ratio crops to simulate variation in focal length and image shape. During training, input sequences are randomly shuffled to vary the reference frame and promote generalization. We use the AdamW optimizer with a learning rate of 0.0001 and cosine annealing.
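
The Hardware Specification and Experiment Setup entries quote a learning rate of 0.0001 with cosine annealing, 250k steps, a batch size of 128, 24-frame samples, and 128 H100 GPUs over 5 days. The sketch below only illustrates what those numbers imply; it is not the authors' training code. It assumes a plain cosine schedule with no warmup and a final learning rate of zero (neither is stated in the quoted text), treats 128 as the global batch size, and treats "5 days" as continuous use of all GPUs. The function names `cosine_lr` and `training_budget` are hypothetical.

```python
import math

# Values quoted in the Experiment Setup / Hardware Specification entries.
BASE_LR = 1e-4
TOTAL_STEPS = 250_000
BATCH_SIZE = 128          # assumed to be the global batch size
FRAMES_PER_SAMPLE = 24
NUM_GPUS = 128
TRAINING_DAYS = 5         # assumed to be continuous wall-clock time

# Assumption: the source does not state a minimum learning rate or warmup.
MIN_LR = 0.0


def cosine_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, min_lr=MIN_LR):
    """Cosine-annealed learning rate at `step` (no warmup, hypothetical helper)."""
    step = min(max(step, 0), total_steps)   # clamp to the schedule range
    progress = step / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def training_budget():
    """Rough totals implied by the quoted setup (hypothetical helper)."""
    samples_seen = TOTAL_STEPS * BATCH_SIZE
    return {
        "gpu_hours": NUM_GPUS * TRAINING_DAYS * 24,
        "samples_seen": samples_seen,
        "frames_seen": samples_seen * FRAMES_PER_SAMPLE,
    }


if __name__ == "__main__":
    for s in (0, 62_500, 125_000, 187_500, 250_000):
        print(f"step {s:>7}: lr = {cosine_lr(s):.2e}")
    print(training_budget())
```

Under these assumptions the run corresponds to roughly 15,360 GPU-hours and 32 million 24-frame training samples. The actual figures depend on details the quoted text does not give, such as warmup, gradient accumulation, and GPU utilization.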