Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Efficient Bayesian Experiment Design with Equivariant Networks

Authors: Conor Igoe, Tejus Gupta, Jeff G. Schneider

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experiments aim to answer the following questions: 1. Do GNNs offer an effective inductive bias for efficiently learning BED policies using data from an expensive oracle? 2. Can GNNs be used to learn non-myopic BED policies using reinforcement learning? 3. How well do GNN-based policies transfer across BED problems, and do they scale to larger search problems?
Researcher Affiliation	Academia	Conor Igoe Machine Learning Department Carnegie Mellon University EMAIL Tejus Gupta Robotics Institute Carnegie Mellon University EMAIL Jeff Schneider Robotics Institute Carnegie Mellon University EMAIL
Pseudocode	No	The paper describes theoretical proofs and mathematical formulations (e.g., Bellman equation), but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks with structured, code-like steps.
Open Source Code	No	Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [No] Justification: We have not provided a full codebase for this paper. However, our contribution is in articulating a specific problem independent of specific code implementation, and our results rely on open source widely used GNN, Transformer, CNN and FCN models.
Open Datasets	No	The paper refers to 'Bayesian Optimization (BO)' and 'Active Search (AS)' tasks, which are problem formulations, and describes their characteristics and specific parameters (e.g., '1D domain with M = 32 grid points', 'zero-mean Gaussian prior with a chosen kernel'). It does not specify the use of any well-known public datasets, provide links, DOIs, or formal citations for data sources that would allow public access to raw experimental data.
Dataset Splits	Yes	We train these networks on an 80:20 train:test split of data from D using Adam, with varying dataset size |D| ∈ {50, 500, 5000, 50000} (full results presented in the appendix).
Hardware Specification	Yes	B Compute Resources: All experiments in this paper were conducted on a cluster of 8 NVIDIA 2080 Ti GPUs. The longest-running experiments involved training the 10 Transformer model seeds on the 8D Bayesian Optimization continuous behavior cloning task, which took approximately one week.
Software Dependencies	No	The paper mentions using 'Adam' as an optimizer and 'DDQN' for reinforcement learning, as well as 'BoTorch' for Bayesian Optimization. However, it does not specify version numbers for any of these software components or libraries, which is necessary for reproducibility.
Experiment Setup	Yes	Table 10: Behavior Cloning Hyperparameters: Learning Rate 3e-4; Batch Size 32; Optimizer Adam. Table 11: Reinforcement Learning Hyperparameters: Algorithm DDQN; Exploration ϵ-greedy; Discount factor 0.95; Learning Rate 3e-4; Batch Size 32; Optimizer Adam.
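
Because the paper does not release code, the reported setup can only be approximated from the quoted text. The following is a minimal Python sketch that records the hyperparameters quoted from Tables 10 and 11 and the 80:20 split from the Dataset Splits row. All class and function names are hypothetical, and the shuffling, the seed, and the textbook Double DQN target are assumptions: the quoted excerpts name the algorithm but do not describe how the split was drawn or how the target was implemented.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class BehaviorCloningConfig:
    # Values as quoted from Table 10.
    learning_rate: float = 3e-4
    batch_size: int = 32
    optimizer: str = "Adam"


@dataclass(frozen=True)
class ReinforcementLearningConfig:
    # Values as quoted from Table 11. The epsilon schedule is not reported.
    algorithm: str = "DDQN"
    exploration: str = "epsilon-greedy"
    discount_factor: float = 0.95
    learning_rate: float = 3e-4
    batch_size: int = 32
    optimizer: str = "Adam"


# Dataset sizes |D| and split ratio as quoted in the Dataset Splits row.
DATASET_SIZES = (50, 500, 5000, 50000)
TRAIN_FRACTION = 0.8


def train_test_split(examples, train_fraction=TRAIN_FRACTION, seed=0):
    """Shuffle and split examples into train and test lists.

    Assumption: shuffling with a fixed seed; the source does not say
    how the 80:20 split was drawn.
    """
    items = list(examples)
    random.Random(seed).shuffle(items)
    n_train = int(round(train_fraction * len(items)))
    return items[:n_train], items[n_train:]


def ddqn_target(reward, done, next_q_online, next_q_target, discount):
    """Textbook Double DQN target (assumed, not confirmed by the source).

    The online network's Q-values select the next action; the target
    network's Q-values evaluate it.
    """
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + discount * next_q_target[best_action]
```

For example, train_test_split(range(50)) yields 40 training and 10 test examples, and ddqn_target would be called with discount=ReinforcementLearningConfig().discount_factor. Pinning the seed and library versions alongside such a configuration would address the gaps flagged in the Software Dependencies row.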