Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Constraining Generative Models for Engineering Design with Negative Data

Authors: Lyle Regenwetter, Giorgio Giannone, Akash Srivastava, Dan Gutfreund, Faez Ahmed

TMLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our benchmarks showcase both the best-in-class performance of our new NDGM formulation and the overall dominance of NDGMs versus classic generative models. We publicly release the code and benchmarks at https://github.com/Lyleregenwetter/NDGMs. We evaluate our model on an expansive set of benchmarks including specially-constructed test problems, authentic engineering tasks featuring real-world constraints from engineering standards, and a final high-dimensional topology optimization study."
Researcher Affiliation | Collaboration | Lyle Regenwetter (EMAIL), Massachusetts Institute of Technology; Giorgio Giannone (EMAIL), Amazon and Massachusetts Institute of Technology; Akash Srivastava (EMAIL), MIT-IBM Watson AI Lab; Dan Gutfreund (EMAIL), MIT-IBM Watson AI Lab; Faez Ahmed (EMAIL), Massachusetts Institute of Technology
Pseudocode | Yes | "Pseudocode for our GAN-MDD training formulation is shown below: Algorithm 1 GAN-MC Training Procedure" (Appendix B, Pseudocode)
Open Source Code | Yes | "Our benchmarks showcase both the best-in-class performance of our new NDGM formulation and the overall dominance of NDGMs versus classic generative models. We publicly release the code and benchmarks at https://github.com/Lyleregenwetter/NDGMs."
Open Datasets | Yes | "We have curated a benchmark of a dozen diverse engineering tasks, which are discussed in detail in Appendix F. ... Ashby Chart: Taken from (Jetton et al., 2023)... Bike Frame: The FRAMED dataset (Regenwetter et al., 2022b)... Ship Hull: The SHIPD Dataset (Bagazinski & Ahmed, 2023)..."
Dataset Splits | No | "F.1.1 Dataset Details — Problem 1: Datapoints are randomly sampled from one of six modes... Sampling is performed until 5K positive samples and 5K negative samples are acquired... Problem 2: Datapoints are uniformly sampled... Sampling is performed until 5K positive samples and 5K negative samples are acquired... F.2.1 Dataset Details — For all datasets in this section, optimization objectives are not utilized. ... 1K positive samples and 1K negative samples are selected using uniform random sampling. ... F.3.1 Dataset Details — The GAN was trained exclusively on 32436 valid (connected) topologies... The GAN-MDD and GAN-DO models are trained on a medley of disconnected topologies generated by iterative optimization (2564), and either procedurally-generated synthetic topologies (35000) or GAN-generated disconnected topologies (92307)."
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running its experiments.
Software Dependencies | No | The paper mentions the "Adam optimizer (Kingma & Ba, 2014)" as an algorithm used, but does not list specific software libraries or their version numbers like PyTorch, TensorFlow, or Python versions.
Experiment Setup | Yes | "F.1.2 Training Details — All tested networks (encoder, decoder, generator, discriminator, DDPM noise model, auxiliary discriminator) are deep networks with one hidden layer of 400 neurons and ReLU activations. A batch size of 256 is used throughout. Models are trained using the Adam optimizer (Kingma & Ba, 2014) with a learning rate of 3×10⁻⁴. Models are trained for 10000 epochs. The noise dimension for the GAN and latent dimension for the VAE are set at 8. Diversity weights are set at 0.1."
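The balanced-sampling procedure quoted under Dataset Splits (draw datapoints until 5K positive and 5K negative samples are acquired) can be sketched as follows. This is an illustration only: the paper's actual mode locations and validity constraints are not given in this excerpt, so `is_valid` and the uniform draw below are placeholder stand-ins, not the authors' definitions.

```python
import random

def is_valid(x):
    """Placeholder constraint: the paper's problems use real engineering
    constraints; here we arbitrarily label x >= 0 as 'positive'."""
    return x >= 0

def sample_balanced(n_per_class, draw, check, seed=0):
    """Keep drawing until n_per_class positive and n_per_class negative
    samples are collected, discarding overflow in either class."""
    rng = random.Random(seed)
    pos, neg = [], []
    while len(pos) < n_per_class or len(neg) < n_per_class:
        x = draw(rng)
        if check(x):
            if len(pos) < n_per_class:
                pos.append(x)
        elif len(neg) < n_per_class:
            neg.append(x)
    return pos, neg

# 5K positive and 5K negative samples, as in problems F.1.1/F.1.2
pos, neg = sample_balanced(5000, lambda r: r.uniform(-1.0, 1.0), is_valid)
```

The key property, matching the quoted description, is that sampling continues until both classes reach their target count, yielding an exactly balanced positive/negative dataset.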
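The hyperparameters quoted under Experiment Setup pin down the network shape fairly precisely. The sketch below collects them and builds one such network (a generator mapping the 8-dimensional noise vector to a sample). It is a minimal numpy illustration, not the authors' code: the deep-learning framework, weight initialization, and output dimension (2 here) are assumptions not stated in the quoted excerpt.

```python
import numpy as np

# Hyperparameters quoted from the paper's Section F.1.2
HIDDEN_UNITS = 400      # one hidden layer of 400 neurons
BATCH_SIZE = 256        # used throughout
LEARNING_RATE = 3e-4    # Adam optimizer (Kingma & Ba, 2014)
EPOCHS = 10000
LATENT_DIM = 8          # GAN noise / VAE latent dimension
DIVERSITY_WEIGHT = 0.1

def init_mlp(in_dim, out_dim, rng):
    """One-hidden-layer MLP; He-style init is an assumption, not from the paper."""
    return {
        "W1": rng.standard_normal((in_dim, HIDDEN_UNITS)) * np.sqrt(2.0 / in_dim),
        "b1": np.zeros(HIDDEN_UNITS),
        "W2": rng.standard_normal((HIDDEN_UNITS, out_dim)) * np.sqrt(2.0 / HIDDEN_UNITS),
        "b2": np.zeros(out_dim),
    }

def forward(params, x):
    """Hidden layer uses ReLU activation, as described in the paper."""
    h = np.maximum(0.0, x @ params["W1"] + params["b1"])
    return h @ params["W2"] + params["b2"]

rng = np.random.default_rng(0)
gen = init_mlp(LATENT_DIM, 2, rng)                    # output dim 2 is assumed
z = rng.standard_normal((BATCH_SIZE, LATENT_DIM))     # one batch of noise
samples = forward(gen, z)                             # shape (256, 2)
```

All six network roles listed in the quote (encoder, decoder, generator, discriminator, DDPM noise model, auxiliary discriminator) share this same one-hidden-layer shape per the paper; only their input and output dimensions would differ.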