Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
LiteReality: Graphics-Ready 3D Scene Reconstruction from RGB-D Scans
Authors: Zhening Huang, Xiaoyang Wu, Fangcheng Zhong, Hengshuang Zhao, Matthias Niessner, Joan Lasenby
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate LiteReality both end-to-end and per stage. Since no prior work directly tackles this task, we propose three benchmarks: retrieval similarity, object-centric PBR material estimation, and full-scene graphics-ready reconstruction. These enable quantitative comparisons with prior work and provide a foundation for future research. |
| Researcher Affiliation | Academia | 1 University of Cambridge; 2 The University of Hong Kong; 3 Technical University of Munich |
| Pseudocode | No | The paper describes the methodology using textual explanations and pipeline diagrams (e.g., Figure 2, Figure 3, Figure 4, Figure 15) but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: Yes, the dataset and code used will be released to reproduce the results and to test the proposed method on real-life scans. |
| Open Datasets | Yes | We evaluate LiteReality on the ScanNet dataset [8] and on real-world indoor scans captured with an iPhone. ... To evaluate the performance of proposed object retrieval method, we conduct experiments on the ScanNet dataset [8]. ... To build this database, we curated assets primarily from 3D-Future [14] and AI2-THOR [24]. For underrepresented categories, we supplemented assets from Sketchfab [51] under public-access licenses. |
| Dataset Splits | Yes | Retrieval Similarity Benchmarking. To evaluate the performance of proposed object retrieval method, we conduct experiments on the ScanNet dataset [8]. ... Experiments use the ScanNet validation scenes. |
| Hardware Specification | Yes | Runtime Analysis. On a single NVIDIA RTX 3090 GPU with 24 GB of memory, the complete reconstruction of a room-scale scene takes between 20 and 60 minutes, depending on scene complexity. |
| Software Dependencies | No | The paper mentions software and models like Apple's RoomPlan, Blender, SAM [23], Grounding DINO, Multi-Modal Large Language Model (MLLM), CLIP, GPT-4, and DINOv2 [45]. However, it does not specify version numbers for these software packages, libraries, or models (e.g., Blender version, Python version, specific deep learning framework versions like PyTorch/TensorFlow). |
| Experiment Setup | Yes | In the subsequent pose-aware rendering and comparison step, these candidates are placed into the scene according to the detected pose and rendered from the same camera angles. The resulting views are then cropped and re-encoded for visual feature extraction, further narrowing the selection to the top four candidates. Finally, a contextual selection step employs a language model to assess high-level attributes such as style, proportion, and visual coherence, yielding the best-matched object. ... For each object, we selected the four most visible frames and tightly cropped each to isolate the object... |
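
The narrowing step quoted in the Experiment Setup row (visual features reduce the candidates to the top four, then a contextual step picks one) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes feature vectors are already available as plain lists of floats, uses cosine similarity as a stand-in similarity measure, and replaces the language-model step with a hypothetical `judge` callable that returns a score per candidate. All function and variable names here are illustrative.

```python
from math import sqrt
from typing import Callable, Mapping, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_candidates(
    query: Sequence[float],
    candidates: Mapping[str, Sequence[float]],
    k: int = 4,
) -> list[str]:
    """Rank candidate names by similarity to the query and keep the best k.

    Assumption: similarity is cosine similarity on precomputed features.
    """
    ranked = sorted(
        candidates,
        key=lambda name: cosine(query, candidates[name]),
        reverse=True,
    )
    return ranked[:k]


def select_best(shortlist: Sequence[str], judge: Callable[[str], float]) -> str:
    """Pick the shortlist entry with the highest judge score.

    `judge` is a hypothetical stand-in for the contextual language-model step.
    """
    return max(shortlist, key=judge)


# Toy data: a query feature and six candidate features (made up for illustration).
query_feature = [1.0, 0.0]
candidate_features = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [0.5, 0.5],
    "e": [-1.0, 0.0],
    "f": [0.8, 0.2],
}
shortlist = top_k_candidates(query_feature, candidate_features, k=4)

# Made-up contextual scores; a real system would query a model here.
toy_scores = {"a": 0.1, "b": 0.9, "f": 0.3, "d": 0.2}
best = select_best(shortlist, lambda name: toy_scores[name])
```

Separating the shortlist step from the final choice keeps the expensive contextual judgment limited to a handful of candidates, which mirrors the staged structure described in the quoted text.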