Vitruvion: A Generative Model of Parametric CAD Sketches

Authors: Ari Seff, Wenda Zhou, Nick Richardson, Ryan P. Adams

ICLR 2022

Reproducibility assessment: each variable below lists the result, followed by the supporting LLM response excerpt from the paper.
Research Type: Experimental
"Evaluation of the proposed approach demonstrates its ability to synthesize realistic CAD sketches and its potential to aid the mechanical design workflow. ... We evaluate several versions of our model in both primitive generation and constraint generation settings. Quantitative assessment is conducted by measuring negative log-likelihood (NLL) and predictive accuracy on a held-out test set as well as via distributional statistics of generated sketches. We also examine the model's performance on conditional synthesis tasks."
Researcher Affiliation: Academia
"Ari Seff (1), Wenda Zhou (2,3), Nick Richardson (1), & Ryan P. Adams (1). 1 Princeton University, 2 New York University, 3 Flatiron Institute. {aseff,njkrichardson,rpa}@princeton.edu, {wz2247}@nyu.edu"
Pseudocode: No
No explicit pseudocode blocks or sections labeled 'Algorithm' were found in the paper.
Open Source Code: Yes
"For code and pre-trained models, see https://lips.cs.princeton.edu/vitruvion."
Open Datasets: Yes
"Our models are trained on the SketchGraphs dataset (Seff et al., 2020), which consists of several million CAD sketches collected from Onshape."
Dataset Splits: Yes
"We randomly divide the filtered collection of sketches into a 92.5% training, 2.5% validation, and 5% testing partition."
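The 92.5/2.5/5 partition quoted above can be sketched in a few lines. This is a hypothetical illustration, not the authors' code; the function name, seed handling, and use of plain Python are all assumptions.

```python
import random

def split_sketches(sketches, seed=0):
    """Randomly partition sketches into train/validation/test sets
    following the 92.5% / 2.5% / 5% ratios quoted from the paper."""
    rng = random.Random(seed)           # fixed seed for a reproducible split
    shuffled = list(sketches)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(0.925 * n)
    n_val = int(0.025 * n)
    return (
        shuffled[:n_train],                  # 92.5% training
        shuffled[n_train:n_train + n_val],   # 2.5% validation
        shuffled[n_train + n_val:],          # 5% testing
    )

train, val, test = split_sketches(range(1000))
```

Splitting at the sketch level (rather than, say, the document level) matches the paper's description of dividing the filtered collection of sketches directly.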
Hardware Specification: Yes
"Training is performed on a server with four Nvidia V100 GPUs, and our models take between three and six hours to train. ... Training was performed on cluster servers equipped with four Nvidia V100 GPUs."
Software Dependencies: No
The paper mentions the Adam optimizer and specific learning rate schedules, but does not provide version numbers for any software dependencies such as programming languages or libraries (e.g., Python, PyTorch, TensorFlow).
Experiment Setup: Yes
Appendix E is titled 'EXPERIMENTAL DETAILS'. Within it: "Our models all share a main transformer responsible for processing the sequence of primitives or constraints which is the target of the inference problem. This transformer architecture is identical across all of the models, using a standard transformer decoder with 12 layers, 8 attention heads, and a total embedding dimension of 256. ... The models were trained using the Adam optimizer with decoupled weight decay regularization (Loshchilov & Hutter, 2019), with the learning rate set according to a one-cycle learning rate schedule (Smith & Topin, 2018). The initial learning rate was set to 3e-5 (at reference batch size 128, scaled linearly with the total batch size). The batch size was set according to the memory usage of the different models: 1024 / GPU for the raw primitive model, 512 / GPU for the image-to-primitive model, and 384 / GPU for the constraint model. ... The primitive model and the constraint model were each trained for 30 epochs, and the image-to-primitive model was trained for 40 epochs."
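The quoted setup (12-layer, 8-head decoder at embedding dimension 256; Adam with decoupled weight decay; one-cycle schedule; 3e-5 reference learning rate scaled linearly with total batch size) maps onto standard library components. Below is a minimal sketch assuming PyTorch, which the paper does not explicitly name; the weight-decay value, step count, and all variable names are illustrative assumptions, not the authors' configuration.

```python
import torch
from torch import nn

# Backbone described in Appendix E: a standard transformer decoder with
# 12 layers, 8 attention heads, and total embedding dimension 256.
D_MODEL, N_LAYERS, N_HEADS = 256, 12, 8

layer = nn.TransformerDecoderLayer(d_model=D_MODEL, nhead=N_HEADS,
                                   batch_first=True)
decoder = nn.TransformerDecoder(layer, num_layers=N_LAYERS)

# Reference learning rate of 3e-5 at batch size 128, scaled linearly
# with the total batch size (1024 per GPU across the four V100 GPUs
# reported for the raw primitive model).
REF_LR, REF_BATCH = 3e-5, 128
total_batch = 1024 * 4
lr = REF_LR * total_batch / REF_BATCH

# AdamW implements Adam with decoupled weight decay (Loshchilov & Hutter,
# 2019); the 1e-2 decay value here is an assumption, not from the paper.
optimizer = torch.optim.AdamW(decoder.parameters(), lr=lr,
                              weight_decay=1e-2)

# One-cycle schedule (Smith & Topin, 2018); total_steps is illustrative.
scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=lr,
                                                total_steps=10_000)
```

The linear scaling rule keeps the per-example step size roughly constant when the batch grows, which is why the paper anchors the rate to a reference batch size of 128.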