Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Amortized Active Generation of Pareto Sets

Authors: Daniel M Steinberg, Asiri Wijesinghe, Rafael Oliveira, Piotr Koniusz, Cheng Soon Ong, Edwin V. Bonilla

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirical results on synthetic benchmarks and protein design tasks demonstrate strong sample efficiency and effective preference incorporation.
Researcher Affiliation	Academia	1 CSIRO's Data61, 2 University of New South Wales, 3 Australian National University
Pseudocode	Yes	Algorithm 1 A-GPS optimization loop.
Open Source Code	Yes	For code implementing A-GPS, VSD and all of the experimental results, please see github.com/csiro-funml/variationalsearch.
Open Datasets	Yes	Empirical results on synthetic benchmarks and protein design tasks demonstrate strong sample efficiency and effective preference incorporation. ... Ehrlich synthetic landscape [37] with a ProtBert [18] naturalness score. ... bi-grams optimization task from [38] ... simulation-based protein stability vs. solvent accessible surface area (SASA) task from [38].
Dataset Splits	Yes	All methods use 64 training points, and then recommend B = 5 candidates for T = 10 rounds... All methods are given 128 training samples... We start with 512 random sequences... 512 training samples, T = 64, B = 16... Train prior with a 10% validation set...
Hardware Specification	Yes	All experiments were run on a Dell PowerEdge XE9640 rack server cluster with NVIDIA H100 GPUs and 4th generation Intel Xeon CPUs.
Software Dependencies	No	The paper mentions 'BoTorch [4]' and 'poli and poli-baselines libraries [20]' but does not provide specific version numbers for these software dependencies.
Experiment Setup	Yes	For all experiments we set β = 0.5 as the full KL regularization in Equation 15 can hamper exploitation in later rounds on some tasks. We refer the reader to Appendix D for full experimental details. ... Table 4: Synthetic test functions experimental settings. ... Table 6: Sequence experimental settings.