Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds

Authors: Bingquan Dai, Luo Li, Qihong Tang, Jie Wang, Xinyu Lian, Hao Xu, Minghan Qin, Xudong XU, Bo Dai, Haoqian Wang, Zhaoyang Lyu, Jiangmiao Pang

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate our approach against existing shape-to-code methods, with experimental results and quantitative metrics demonstrating that our framework significantly outperforms prior work. Furthermore, our code-based representation enhances the reasoning capabilities of LLMs in 3D shape understanding tasks. We conduct training and evaluation on the Infinigen Indoor datasets [36]. For reconstruction performance, we compare our method with two representative shape-to-code baselines, Shape2Prog [1] and PLAD [2]. We adopt IoU and L2 CD as our evaluation metrics. In Table 1, we present reconstruction metrics for some specific object categories as well as the overall performance across the entire dataset. We conducted a series of ablation studies to evaluate the impact of various components within our model.
Researcher Affiliation	Collaboration	1Shanghai Artificial Intelligence Laboratory 2Tsinghua University 3The University of Hong Kong 4Harbin Institute of Technology 5Beijing Institute of Technology 6AI Thrust, HKUST(GZ)
Pseudocode	No	The paper provides examples of Blender Python scripts generated by MeshCoder (e.g., in Figure 2 and Figure 9 for shape understanding questions), but it does not present the core MeshCoder algorithm or methodology in a structured pseudocode block or algorithm format.
Open Source Code	No	Answer: [Yes] Yes, we will provide open access to the data and code. Guidelines: The answer NA means that paper does not include experiments requiring code. Please see the NeurIPS code and data submission guidelines (https://nips.cc/public/guides/CodeSubmissionPolicy) for more details. While we encourage the release of code and data, we understand that this might not be possible, so No is an acceptable answer. Papers cannot be rejected simply for not including code, unless this is central to the contribution (e.g., for a new open-source benchmark). The instructions should contain the exact command and environment needed to run to reproduce the results. See the NeurIPS code and data submission guidelines (https://nips.cc/public/guides/CodeSubmissionPolicy) for more details. The authors should provide instructions on data access and preparation, including how to access the raw data, preprocessed data, intermediate data, and generated data, etc. The authors should provide scripts to reproduce all experimental results for the new proposed method and baselines. If only a subset of experiments are reproducible, they should state which ones are omitted from the script and why. At submission time, to preserve anonymity, the authors should release anonymized versions (if applicable). Providing as much information as possible in supplemental material (appended to the paper) is recommended, but including URLs to data and code is permitted.
Open Datasets	Yes	We trained our model on the Infinigen Indoor [4] dataset. Infinigen Indoor is a procedural framework for generating synthetic 3D indoor objects, where each generated instance is automatically composed by its corresponding parts. We have made extensive modifications to the original Infinigen codebase to enable it to produce both individual components and their complete assemblies. Using this framework, we constructed a synthetic dataset comprising 41 common object categories, generating 1 million object-code pairs in total.
Dataset Splits	Yes	We partitioned the dataset into 70% for training, 15% for validation, and 15% for testing. ... We partitioned the dataset into training, validation, and test sets, following the same split strategy as the Synthetic Part Dataset.
Hardware Specification	Yes	For the part-to-code reconstruction model, we adopt the AdamW optimizer and train it for 20 epochs on NVIDIA A100 GPUs for about a week with a batch size of 512, and a learning rate of 10^-4. ... It is trained on 64 NVIDIA A100 GPUs for about 2 days.
Software Dependencies	No	We train a multimodal large language model (LLM) that translates 3D point cloud into executable Blender Python scripts. ... We use Llama-3.2-1B as the base LLM and finetune it using LoRA.
Experiment Setup	Yes	For the part-to-code reconstruction model, we adopt the AdamW optimizer and train it for 20 epochs on NVIDIA A100 GPUs for about a week with a batch size of 512, and a learning rate of 10^-4. We evaluate the model at every epoch and select the checkpoint with the lowest L2 Chamfer Distance (CD) loss. Then we initialize the weights of the object-to-code reconstruction model with the weights of the trained part-to-code inference model, and train the model on Infinigen Indoor dataset for 10 epochs, with a batch size of 256, and a learning rate of 10^-4. It is trained on 64 NVIDIA A100 GPUs for about 2 days. The checkpoint with the lowest CD loss is selected. To further enhance the robustness and generalization ability of the object-to-code inference model, we apply data augmentation techniques. Specifically, we perform random rotation and scaling on the objects. Additionally, during training, we randomly sample the number of points in each point cloud within the range of 4096 to 16384, and add Gaussian noise to further perturb the input.
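
The 70/15/15 split and the point-cloud augmentation quoted in the table can be illustrated with a minimal, standard-library-only Python sketch. This is not the authors' code. The function names `split_indices` and `augment` are hypothetical. The quoted text does not state the rotation axis, the scaling range, or the noise level, so the rotation about the z axis, the uniform scale range of 0.8 to 1.2, and the noise standard deviation of 0.01 are all assumptions made for illustration. Only the split ratios and the 4096 to 16384 point range come from the quoted text.

```python
import math
import random


def split_indices(n, train_pct=70, val_pct=15, seed=0):
    """Shuffle range(n) and cut it into train/val/test index lists.

    The 70/15/15 ratios come from the quoted text; the shuffling and the
    seed are assumptions.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def augment(points, rng, min_pts=4096, max_pts=16384,
            scale_range=(0.8, 1.2), noise_std=0.01):
    """Return an augmented copy of a point cloud given as (x, y, z) tuples.

    Steps: subsample a random number of points, rotate, scale, add noise.
    The rotation axis (z), scale_range, and noise_std are assumed values.
    """
    # Draw the target point count from the quoted range, capped by the
    # number of points actually available.
    k = min(len(points), rng.randint(min_pts, max_pts))
    sample = rng.sample(points, k)

    # Assumed: one random rotation about the z axis per object.
    theta = rng.uniform(0.0, 2.0 * math.pi)
    c, s = math.cos(theta), math.sin(theta)

    # Assumed: one uniform scale factor per object.
    scale = rng.uniform(*scale_range)

    out = []
    for x, y, z in sample:
        xr, yr = c * x - s * y, s * x + c * y
        out.append((
            scale * xr + rng.gauss(0.0, noise_std),
            scale * yr + rng.gauss(0.0, noise_std),
            scale * z + rng.gauss(0.0, noise_std),
        ))
    return out


if __name__ == "__main__":
    train, val, test = split_indices(1000)
    print(len(train), len(val), len(test))

    rng = random.Random(0)
    cloud = [(rng.random(), rng.random(), rng.random()) for _ in range(20000)]
    print(len(augment(cloud, rng)))
```

Passing an explicit `random.Random` instance keeps each augmentation call reproducible under a fixed seed, which helps when checking a reported setup against a re-implementation.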