Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Efficient Biological Data Acquisition through Inference Set Design

Authors: Ihor Neporozhnii, Julien Roy, Emmanuel Bengio, Jason Hartford

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our empirical studies on image and molecular datasets, as well as a real-world large-scale biological assay, show that active learning for inference set design leads to significant reduction in experimental cost while retaining high system performance.
Researcher Affiliation	Collaboration	1Valence Labs 2University of Toronto 3University of Manchester
Pseudocode	Yes	A pseudo-code is available in Appendix B.
Open Source Code	Yes	The code is available at https://github.com/ineporozhnii/inference_set_design. All datasets to reproduce our results are publicly available, except one proprietary dataset for the results in Figure 8.
Open Datasets	Yes	The whole MNIST training set is used as the target set from which agents can acquire samples. The MNIST test set is split 50-50 into a validation set used for early stopping and a test set used for measuring model performance on held-out data inaccessible by agents. We use the Quantum Machine 9 (QM9) (Ruddigkeit et al., 2012; Ramakrishnan et al., 2014). For our experiments, we start by using the publicly available RxRx3 dataset (Fay et al., 2023). To evaluate the inference set design paradigm on a regression task we use the Molecules3D dataset (Xu et al., 2021).
Dataset Splits	Yes	Both datasets are split into inference, validation, and test sets with 80%, 5%, 15% fractions.
Hardware Specification	No	The paper does not provide specific hardware details used for running its experiments. It mentions 'HTS platforms' but this is a general term and not a specific hardware specification (e.g., GPU/CPU models, memory details).
Software Dependencies	Yes	As a first data processing step, we use the RDKit (Landrum et al., 2024) and Molfeat (Noutahi et al., 2023) libraries to convert molecular structures into SMILES strings and compute their Extended Connectivity Fingerprints (ECFPs).
Experiment Setup	Yes	Table 2: Hyperparameters for experiments.
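
The Dataset Splits row reports 80% / 5% / 15% inference, validation, and test fractions, and the Open Datasets row reports a 50-50 split of the MNIST test set. The sketch below shows one way such fractional splits could be produced. It is an illustration only, not the authors' code: the function name `split_indices`, the seed, and the example size of 1000 items are assumptions, and the paper's repository may split the data differently.

```python
import random


def split_indices(n_items, fractions, seed=0):
    """Shuffle range(n_items) and cut it into consecutive parts sized by `fractions`.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    parts, start = [], 0
    for i, frac in enumerate(fractions):
        # The last part takes the remainder so rounding never drops an item.
        is_last = i == len(fractions) - 1
        end = n_items if is_last else start + round(frac * n_items)
        parts.append(indices[start:end])
        start = end
    return parts


# 80% / 5% / 15% split, as in the Dataset Splits row (1000 items is an assumed size).
inference, validation, test = split_indices(1000, [0.80, 0.05, 0.15])

# 50-50 split of the 10,000-image MNIST test set, as in the Open Datasets row.
val_half, test_half = split_indices(10_000, [0.5, 0.5])
```

Fixing the seed keeps the three sets disjoint and identical across runs, which is the property a reader needs in order to reproduce reported metrics.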