Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Cue3D: Quantifying the Role of Image Cues in Single-Image 3D Generation

Authors: Xiang Li, Zirui Wang, Zixuan Huang, James M. Rehg

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our unified benchmark evaluates seven state-of-the-art methods, spanning regression-based, multi-view, and native 3D generative paradigms. By systematically perturbing cues such as shading, texture, silhouette, perspective, edges, and local continuity, we measure their impact on 3D output quality.
Researcher Affiliation	Academia	Xiang Li Zirui Wang Zixuan Huang James M. Rehg University of Illinois at Urbana-Champaign
Pseudocode	No	The paper describes methods in prose and does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code	No	We will release our code repository upon acceptance. We use public code for the methods and open datasets.
Open Datasets	Yes	We select two standard evaluation datasets for all methods: GSO [12], a dataset of high-quality scanned household items, and Toys4K [47], a collection of user-created 3D toy objects. ... To probe performance on shapes without semantic meaning, we additionally use Zeroverse [57], a procedurally generated dataset built from random assemblies of textured primitives.
Dataset Splits	Yes	Our final evaluation set contains 412 objects from the cleaned GSO dataset and 500 randomly sampled objects from the cleaned Toys4K dataset.
Hardware Specification	Yes	We use 8 NVIDIA L40S GPUs for all our experiments.
Software Dependencies	No	The paper mentions rendering in Blender and using the 'official implementation for all methods', but does not specify versions for any key software libraries or dependencies used in their own experimental framework.
Experiment Setup	Yes	We align the output mesh to the groundtruth following [6]. ... We render 16 views for each object with 8 uniform azimuths and 2 elevations. ... Our input images are 512 × 512 pixels in resolution. For silhouette dilation, we apply a dilation kernel of 10 pixels for the weak variant, 30 pixels for the medium variant, and 60 pixels for the strong variant. For occlusion, we randomly position an occluder mask along the edge of the object and scale it by a factor of 0.1, 0.4, or 0.8 for the weak, medium, and strong variants, respectively. For pixel shuffle, we randomly shuffle all pixels within each non-overlapping N × N grid inside the object mask, with N set to 2, 4, 10, or 20 to represent different perturbation strengths.
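
The pixel shuffle perturbation quoted in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' code, which is not yet released. The function name `block_pixel_shuffle`, the list-of-rows image representation, and the `seed` parameter are all hypothetical. The sketch also assumes that only pixels inside the object mask are permuted and that background pixels stay in place; the quoted text does not say how blocks that straddle the mask boundary are handled.

```python
import random


def block_pixel_shuffle(image, mask, n, seed=0):
    """Shuffle masked pixels within each non-overlapping n x n block.

    image: list of rows, each a list of pixel values (hypothetical format).
    mask:  same shape as image; truthy where a pixel belongs to the object.
    n:     block side length in pixels (the quoted setup uses 2, 4, 10, 20).
    Assumption: pixels outside the mask are never moved.
    """
    rng = random.Random(seed)
    height = len(image)
    width = len(image[0]) if height else 0
    out = [list(row) for row in image]  # copy so the input is not mutated

    for top in range(0, height, n):
        for left in range(0, width, n):
            # Collect the coordinates in this block that lie inside the mask.
            coords = [
                (y, x)
                for y in range(top, min(top + n, height))
                for x in range(left, min(left + n, width))
                if mask[y][x]
            ]
            # Permute the values at those coordinates only.
            values = [image[y][x] for y, x in coords]
            rng.shuffle(values)
            for (y, x), value in zip(coords, values):
                out[y][x] = value
    return out
```

Under these assumptions, calling the function once per value of N in (2, 4, 10, 20) would produce the four perturbation strengths named in the setup. A larger N destroys local continuity over a wider neighborhood while preserving the silhouette.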