Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding

Authors: Junliang Ye, Zhengyi Wang, Ruowen Zhao, Shenghao Xie, Jun Zhu

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experimental results provide strong empirical evidence supporting the effectiveness of the proposed method.
Researcher Affiliation	Collaboration	Junliang Ye (1,3), Zhengyi Wang (1,3), Ruowen Zhao (1), Shenghao Xie (2), Jun Zhu (1,3); (1) Tsinghua University, (2) Peking University, (3) ShengShu
Pseudocode	No	The paper contains architectural diagrams and flowcharts (Figure 1, 2, 7) but no explicit pseudocode blocks or algorithm sections.
Open Source Code	Yes	https://github.com/JAMESYJL/ShapeLLM-Omni/
Open Datasets	Yes	To enable LLMs with 3D ability, we construct a comprehensive training dataset using 3D shapes from a mixture of 3D datasets Deitke et al. [2023a,b], Collins et al. [2022], Chang et al. [2015]. ... We introduce 3D-Alpaca, a comprehensive dataset encompassing tasks in 3D content generation, comprehension, and editing. ... Complete documentation will be provided with all new assets.
Dataset Splits	No	The paper mentions datasets and their sizes, such as "2.56 million samples, comprising 3.46 billion tokens" for 3D-Alpaca, and refers to "test set" in evaluations like "randomly select 1000 samples from the test set" and "Toys4K Stojanov et al. [2021] test dataset", but it does not provide explicit train/validation/test splits (e.g., percentages or specific counts for all splits) for its primary constructed datasets or the LLM training.
Hardware Specification	Yes	Concretely, each stage runs for 1000 steps on 48 NVIDIA H100 GPUs with a batch size of 25... The model is trained for 15 epochs on 48 NVIDIA H100 GPUs.
Software Dependencies	No	The paper mentions using "Qwen-2.5-VL-Instruct-7B Bai et al. [2025]" as the backbone model, but it does not specify version numbers for other general software dependencies like programming languages, libraries, or operating systems (e.g., Python, PyTorch, CUDA).
Experiment Setup	Yes	Concretely, each stage runs for 1000 steps on 48 NVIDIA H100 GPUs with a batch size of 25, while the learning rate decays from 5 × 10^-3 to 5 × 10^-5. For the training of ShapeLLM-Omni...the learning rate decays from 5 × 10^-5 to 5 × 10^-6, with a per-GPU batch size of 2 and gradient accumulation over 2 steps. The model is trained for 15 epochs... We use the AdamW optimizer, with a learning rate of 1e-5, a warm-up of 400 steps with cosine scheduling, and a global batch size of 192. The total training time is around 5 days.
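
The batch-size figures quoted in the Experiment Setup row can be cross-checked with a little arithmetic. The sketch below multiplies GPU count, per-GPU batch size, and gradient accumulation steps, which is the usual convention for a global batch size; the quoted text does not state that formula explicitly, so treat it as an assumption. It also illustrates one common reading of "a warm-up of 400 steps with cosine scheduling" (linear warm-up, then cosine decay). The function names, the linear warm-up shape, the `total_steps` value, and the `min_lr` floor are all hypothetical and are not taken from the paper or its code.

```python
import math

# Values quoted in the Experiment Setup row.
NUM_GPUS = 48
PER_GPU_BATCH = 2
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 400
PEAK_LR = 1e-5


def global_batch_size(num_gpus, per_gpu_batch, grad_accum_steps):
    """Samples contributing to one optimizer update (assumed convention)."""
    return num_gpus * per_gpu_batch * grad_accum_steps


def lr_at(step, total_steps, peak_lr=PEAK_LR,
          warmup_steps=WARMUP_STEPS, min_lr=0.0):
    """Linear warm-up to peak_lr, then cosine decay to min_lr.

    total_steps and min_lr are illustrative assumptions; the quoted
    text gives neither.
    """
    if step < warmup_steps:
        # Ramp linearly so that the last warm-up step reaches peak_lr.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the post-warm-up phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(global_batch_size(NUM_GPUS, PER_GPU_BATCH, GRAD_ACCUM_STEPS))
    hypothetical_total_steps = 10_000  # placeholder, not from the paper
    for s in (0, 399, 5_000, 10_000):
        print(s, lr_at(s, hypothetical_total_steps))
```

Under that assumed convention, 48 GPUs with a per-GPU batch size of 2 and 2 accumulation steps give 192, which matches the global batch size quoted in the row.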