Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Preference-Guided Diffusion for Multi-Objective Offline Optimization
Authors: Yashas Annadani, Syrine Belakaria, Stefano Ermon, Stefan Bauer, Barbara Engelhardt
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach on various continuous offline multi-objective optimization tasks and find that it consistently outperforms other inverse/generative approaches while remaining competitive with forward/surrogate-based optimization methods. |
| Researcher Affiliation | Academia | Yashas Annadani (1,3), Syrine Belakaria (2), Stefano Ermon (2), Stefan Bauer (1,3), Barbara Engelhardt (2,4). Affiliations: (1) TU Munich, (2) Stanford University, (3) Helmholtz AI, Munich, (4) Gladstone Institutes |
| Pseudocode | Yes | Algorithm 1 Sampling from Preference Guided Diffusion |
| Open Source Code | Yes | Correspondence to EMAIL. Code available at https://github.com/yannadani/pgd_moo. |
| Open Datasets | Yes | Our evaluation closely follows the benchmarking effort provided in prior work [45]. We evaluate our approach on two sets of tasks: synthetic and real-world applications-based RE engineering suite [40]. Each task consists of a dataset of 60k offline datapoints. As in [45], we use 54k randomly chosen data points for training and the remaining for validation. |
| Dataset Splits | Yes | Each task consists of a dataset of 60k offline datapoints. As in [45], we use 54k randomly chosen data points for training and the remaining for validation. |
| Hardware Specification | Yes | All the experiments are run on an NVIDIA A100 GPU. |
| Software Dependencies | No | The paper mentions using the AdamW optimizer [29] and the Adam optimizer [20], and describes network architectures, but does not specify software library versions (e.g., PyTorch 1.x, TensorFlow 2.x, or specific Python versions). |
| Experiment Setup | Yes | We parameterize the unconditional denoising model to be a multi-layer perceptron (MLP) with two 512-dimensional hidden layers, followed by a ReLU nonlinearity and layer normalization [26]. We also incorporate sinusoidal time embedding [43] for conditioning. We parameterize the preference model to be an MLP with three hidden layers, with first two hidden layers having the same number of units as the input, while the last hidden layer is having 512 units. Similar to denoising model, we also use ReLU nonlinearity followed by layer normalization and sinusoidal time embedding. The denoising model is trained with AdamW optimizer [29] with learning rate of 5e-4 for up to 200 epochs. Following Ho et al. [17], we employ a linear noise schedule such that the noise β_t grows linearly from 1e-4 to 0.02. The preference model is trained with Adam optimizer [20] with learning rate of 1e-5 for up to 500 epochs. During sampling, we set the guidance weight w to 10. |
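
The linear noise schedule quoted in the Experiment Setup row can be illustrated with a short sketch. This is not the authors' code (see the repository linked above for that); it only shows what a schedule growing linearly from 1e-4 to 0.02 looks like, together with the cumulative signal-retention term used in the diffusion formulation of Ho et al. The number of diffusion steps is not stated in this summary, so `NUM_STEPS = 1000` is an assumption, and the function names are hypothetical.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Return noise levels that grow linearly from beta_start to beta_end."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alpha_bar(betas):
    """Return alpha_bar_t, the running product of (1 - beta_s) for s <= t."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_noise(x0, alpha_bar_t, rng):
    """Draw x_t from N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I)."""
    scale = math.sqrt(alpha_bar_t)
    sigma = math.sqrt(1.0 - alpha_bar_t)
    return [scale * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


# Assumption: the step count is not given in the summary above.
NUM_STEPS = 1000
betas = linear_beta_schedule(NUM_STEPS)
alpha_bars = cumulative_alpha_bar(betas)

# Example: noise a toy 3-dimensional design at the final step.
rng = random.Random(0)
x_T = forward_noise([0.5, -1.0, 2.0], alpha_bars[-1], rng)
```

With these endpoints and the assumed 1000 steps, `alpha_bars` decays from just under 1 to nearly 0, so the final sample `x_T` is almost pure Gaussian noise. A different step count would change how quickly the signal is destroyed but not the endpoints of the schedule.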