Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Diversity By Design: Leveraging Distribution Matching for Offline Model-Based Optimization

Authors: Michael S Yao, James Gee, Osbert Bastani

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments spanning multiple scientific domains show that DynAMO can be used with common optimization methods to significantly improve the diversity of proposed designs while still discovering high-quality candidates. We evaluate DynAMO on a set of six real-world offline MBO tasks spanning multiple scientific domains and both discrete and continuous search spaces.
Researcher Affiliation	Academia	1University of Pennsylvania, Philadelphia, PA, USA. Correspondence to: Michael Yao <EMAIL>.
Pseudocode	Yes	Algorithm 1 (DynAMO). Diversity in Adversarial Model-based Optimization
Open Source Code	Yes	Our custom code implementation for our experiments is made publicly available at github.com/michael-s-yao/DynAMO.
Open Datasets	Yes	All datasets used in our experiments are publicly available; offline datasets associated with Design-Bench tasks are made available by Trabucco et al. (2022). The offline dataset for the Molecule task is made available by Brown et al. (2019).
Dataset Splits	No	All our optimization tasks include an offline, static dataset D = {(x_i, r(x_i))}_{i=1}^{n} of previously observed designs and their corresponding objective values. We first use D to train a task-specific forward surrogate model r_θ with parameters θ according to (2).
Hardware Specification	Yes	All experiments were run for 10 random seeds on a single internal cluster with 8 NVIDIA RTX A6000 GPUs. Of note, all DynAMO experiments were run using only a single GPU.
Software Dependencies	No	We again use an Adam optimizer with a learning rate of η = 3 × 10^-4 for both the VAE and the forward surrogate. [...] Sobol sequence (Sobol, 1967) using the official PyTorch quasi-random generator SobolEngine implementation. While PyTorch is mentioned, specific version numbers for PyTorch or Adam (or its underlying library) are not provided.
Experiment Setup	Yes	We parameterize each forward surrogate model r_θ(x) as a fully connected neural network with two hidden layers of size 2048 and Leaky ReLU activations, trained using an Adam optimizer with a learning rate of η = 0.0003 for 100 epochs. Finally, we fix the KL-divergence weighting β = 1.0, temperature hyperparameter τ = 1.0, and constraint bound W_0 = 0 for all experiments to avoid overfitting DynAMO to any particular task or optimizer.
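
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration object. The following is a minimal sketch and is not taken from the authors' repository: the class name, field names, and helper function are hypothetical, the scalar output (output_dim = 1) is an assumption based on the surrogate predicting a single objective value, and the input dimension of 60 is an arbitrary illustrative value that the source does not report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SurrogateConfig:
    """Values quoted in the Experiment Setup row; all names here are hypothetical."""
    hidden_sizes: tuple = (2048, 2048)  # two hidden layers of size 2048
    activation: str = "leaky_relu"      # Leaky ReLU activations
    optimizer: str = "adam"
    learning_rate: float = 3e-4         # eta = 0.0003
    epochs: int = 100
    kl_weight_beta: float = 1.0         # beta
    temperature_tau: float = 1.0        # tau
    constraint_bound_w0: float = 0.0    # W_0


def mlp_parameter_count(input_dim: int, config: SurrogateConfig, output_dim: int = 1) -> int:
    """Count weights and biases of a fully connected network with these layer sizes.

    output_dim=1 is an assumption (a scalar objective prediction).
    """
    sizes = (input_dim, *config.hidden_sizes, output_dim)
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


config = SurrogateConfig()
# input_dim=60 is an arbitrary illustrative value, not a dimension reported in the source.
n_params = mlp_parameter_count(60, config)
```

Under these assumptions, most of the parameters sit in the 2048 × 2048 connection between the two hidden layers, so the total depends only weakly on the task's input dimension.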