Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

PLANA3R: Zero-shot Metric Planar 3D Reconstruction via Feed-forward Planar Splatting

Authors: Changkun Liu, Bin Tan, Zeran Ke, Shangzhan Zhang, Jiachen Liu, Ming Qian, Nan Xue, Yujun Shen, Tristan Braud

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We validate PLANA3R on multiple indoor-scene datasets with metric supervision and demonstrate strong generalization to out-of-domain indoor environments across diverse tasks under metric evaluation protocols, including 3D surface reconstruction, depth estimation, and relative pose estimation.
Researcher Affiliation	Collaboration	Changkun Liu1,2 Bin Tan2 Zeran Ke2,3 Shangzhan Zhang2,4 Jiachen Liu5 Ming Qian2,3 Nan Xue2 Yujun Shen2 Tristan Braud1 1The Hong Kong University of Science and Technology 2Ant Group 3Wuhan University 4Zhejiang University 5The Pennsylvania State University
Pseudocode	No	The paper describes the architecture and methodology in detail across sections like '3.2 Hierarchical Primitive Prediction Architecture' and '3.3 Training Losses and Training Strategies', but it does not include an explicit pseudocode block or algorithm listing.
Open Source Code	Yes	The project page is available at: https://lck666666.github.io/plana3r/.
Open Datasets	Yes	Since PLANA3R targets structured indoor scenes, we train it on a combination of four public indoor-scene datasets: ScanNetV2 [4], ScanNet++ [39], ARKitScenes [5], and Habitat [23]. For evaluation, we use ScanNetV2, Matterport3D [2], NYUv2 [20], and Replica [26] as test sets.
Dataset Splits	Yes	For ScanNetV2, we follow the training and testing splits defined by NOPE-SAC [28], evaluating 4051 image pairs from 303 scenes. For our experiments, we use the image splits defined in [24, 15] with 654 test frames. In our training set (totally around 4M image pairs), we include approximately 0.57M image pairs with no overlap, while the remaining 3.43M pairs are randomly sampled from nearby frames (mainly within the next 10 frames).
Hardware Specification	Yes	The model is trained for a total of 256 GPU-days on NVIDIA H20 GPUs, with a per-GPU batch size of 6. We evaluate the inference runtime of our PLANA3R using an NVIDIA RTX 3090 GPU.
Software Dependencies	No	The paper mentions using the 'AdamW optimizer [19]' and initializing parts of the model with 'DUSt3R's pre-trained 512-DPT weights', but it does not specify versions for libraries such as PyTorch or TensorFlow, or for Python itself, which a reproducible software environment would require.
Experiment Setup	Yes	We initialize the ViT encoder and the transformer decoder's part of PLANA3R model with DUSt3R's pre-trained 512-DPT weights. Training is performed using the AdamW optimizer [19] with a learning rate starting at 1 × 10⁻⁴ and decaying to 1 × 10⁻⁶. The model is trained for a total of 256 GPU-days on NVIDIA H20 GPUs, with a per-GPU batch size of 6. Training starts with a one-epoch warm-up phase that optimizes only the losses in Eq. (3) and Eq. (5), followed by 10 epochs incorporating all three losses at an input resolution of 512 × 384. During both training and testing, we set the gradient threshold g_th for merging high- and low-resolution primitives to 0.5. For our final model used for evaluation, we set the loss weights α1 = 5, α2 = 5, α3 = 20 in Eq. (3). We set the loss weights β1 = 1, β2 = 1, β3 = 2 in Eq. (4). We set the loss weights γ1 = 10, γ2 = 10, γ3 = 1 in Eq. (5).
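
For readers attempting a reproduction, the reported experiment setup can be collected into a small configuration object. The following is a minimal sketch, not the authors' code: the names TrainConfig, active_losses, and lr_at are hypothetical, the labels "eq3", "eq4", and "eq5" are placeholders for the paper's three losses, and the cosine shape of the learning-rate decay is an assumption, since the paper states only the start and end values.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as reported in the Experiment Setup row (names are hypothetical)."""
    lr_start: float = 1e-4            # initial AdamW learning rate
    lr_end: float = 1e-6              # final learning rate after decay
    per_gpu_batch_size: int = 6
    warmup_epochs: int = 1            # warm-up uses only Eq. (3) and Eq. (5)
    main_epochs: int = 10             # all three losses
    input_resolution: tuple = (512, 384)
    grad_threshold: float = 0.5       # g_th for merging high- and low-resolution primitives
    alpha: tuple = (5.0, 5.0, 20.0)   # loss weights in Eq. (3)
    beta: tuple = (1.0, 1.0, 2.0)     # loss weights in Eq. (4)
    gamma: tuple = (10.0, 10.0, 1.0)  # loss weights in Eq. (5)


def active_losses(epoch: int, cfg: TrainConfig) -> tuple:
    """Return the placeholder labels of the losses optimized in a given 0-indexed epoch."""
    if epoch < cfg.warmup_epochs:
        return ("eq3", "eq5")
    return ("eq3", "eq4", "eq5")


def lr_at(step: int, total_steps: int, cfg: TrainConfig) -> float:
    """Learning rate at a step, decaying from lr_start to lr_end.

    ASSUMPTION: cosine decay. The paper gives only the two endpoints.
    """
    t = min(max(step / max(total_steps, 1), 0.0), 1.0)
    return cfg.lr_end + 0.5 * (cfg.lr_start - cfg.lr_end) * (1.0 + math.cos(math.pi * t))


cfg = TrainConfig()
total_epochs = cfg.warmup_epochs + cfg.main_epochs  # 11 epochs in total
```

Calling lr_at(0, n, cfg) returns the starting rate and lr_at(n, n, cfg) the final rate for any positive n; a different decay shape can be swapped in without touching the rest of the configuration. Library versions are deliberately absent from the sketch because the paper does not report them.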