Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

GeoCAD: Local Geometry-Controllable CAD Generation with Large Language Models

Authors: Zhanwei Zhang, Kaiyuan Liu, Junjie Liu, Wenxiao Wang, Binbin Lin, Liang Xie, Chen Shen, Deng Cai

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments demonstrate the effectiveness of GeoCAD in generation quality, validity and text-to-CAD consistency.
Researcher Affiliation	Collaboration	(1) State Key Lab of CAD&CG, Zhejiang University; (2) Alibaba Cloud Computing; (3) Zhejiang University of Technology; (4) School of Software Technology, Zhejiang University
Pseudocode	No	The paper describes the methodology in Section 3 and illustrates processes with figures, but does not include a structured pseudocode or algorithm block.
Open Source Code	Yes	Code will be available at https://github.com/Zhanwei-Z/GeoCAD.
Open Datasets	Yes	To maintain consistency with prior research [61], we evaluate our GeoCAD on DeepCAD [45], a large-scale 3D sketch-extrude-modeling CAD dataset.
Dataset Splits	Yes	This dataset contains 178,238 sketch-extrusion sequences, which are randomly partitioned into training, validation, and test subsets at a 90%-5%-5% ratio.
Hardware Specification	Yes	The model is trained on 8 A100 GPUs using AdamW [30], with a batch size of 32, a cosine annealing learning rate initialized at 5 × 10⁻⁴, and trained for 10 and 30 epochs across stage 1 and stage 2.
Software Dependencies	No	The paper mentions using Llama-3-8B as the base LLM and LoRA for fine-tuning, but does not provide specific version numbers for software dependencies like Python, PyTorch, or CUDA.
Experiment Setup	Yes	We use the same LoRA [16] setting as used in [61], with a rank of 8 and an alpha of 32... The model is trained on 8 A100 GPUs using AdamW [30], with a batch size of 32, a cosine annealing learning rate initialized at 5 × 10⁻⁴, and trained for 10 and 30 epochs across stage 1 and stage 2. During inference, we set the temperature τ and Top-p at 0.9 and 0.9 to balance quality and validity in local generation.
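
To make the reported numbers easier to check, the following is a minimal sketch that collects the values quoted in the Dataset Splits and Experiment Setup rows and derives approximate split sizes and a cosine learning-rate curve from them. It is not the authors' code. The names `split_sizes`, `cosine_lr`, and `ReportedSetup` are hypothetical, and the rounding scheme, the zero minimum learning rate, and the absence of warmup are assumptions that the quoted text does not specify.

```python
import math
from dataclasses import dataclass

# Quoted in the Dataset Splits row.
TOTAL_SEQUENCES = 178_238


@dataclass(frozen=True)
class ReportedSetup:
    """Hyperparameters as quoted in the Experiment Setup row."""
    lora_rank: int = 8
    lora_alpha: int = 32
    batch_size: int = 32          # global vs. per-GPU is not stated
    base_lr: float = 5e-4
    epochs_stage1: int = 10
    epochs_stage2: int = 30
    temperature: float = 0.9
    top_p: float = 0.9


def split_sizes(total, val_ratio=0.05, test_ratio=0.05):
    """Approximate train/val/test sizes for a 90%-5%-5% split.

    Assumption: validation and test are rounded to the nearest integer
    and the training set takes the remainder.
    """
    val = round(total * val_ratio)
    test = round(total * test_ratio)
    train = total - val - test
    return train, val, test


def cosine_lr(step, total_steps, base_lr=5e-4, min_lr=0.0):
    """Textbook cosine annealing from base_lr down to min_lr.

    Assumption: no warmup and min_lr = 0; the source only states the
    initial learning rate and that the schedule is cosine annealing.
    """
    progress = min(step / total_steps, 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    setup = ReportedSetup()
    train, val, test = split_sizes(TOTAL_SEQUENCES)
    print("split sizes (train, val, test):", train, val, test)
    print("LoRA scaling alpha / rank:", setup.lora_alpha / setup.lora_rank)
    for frac in (0.0, 0.5, 1.0):
        print(f"lr at {frac:.0%} of training:", cosine_lr(frac * 1000, 1000, setup.base_lr))
```

The ratio `lora_alpha / lora_rank` is the scaling factor applied to the low-rank update in the standard LoRA formulation, which is why the two values are usually reported together.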