Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Probabilistic Factorial Experimental Design for Combinatorial Interventions

Authors: Divya Shyamal, Jiaqi Zhang, Caroline Uhler

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conduct experiments to validate our theoretical results, as well as show a comparison to fractional factorial design, using simulated data.
Researcher Affiliation	Academia	1Department of Electrical Engineering and Computer Science, MIT 2Department of Mathematics, MIT 3Eric and Wendy Schmidt Center, Broad Institute. Correspondence to: Jiaqi Zhang <EMAIL>, Caroline Uhler <EMAIL>.
Pseudocode	Yes	Algorithm 1 Active probabilistic factorial experimental design.
Open Source Code	Yes	Code can be found at the linked repository.
Open Datasets	No	We generate the outcome model f by sampling the Fourier coefficients from the uniform distribution U(−1, 1)^K. (The paper explicitly states that the data is simulated rather than drawn from a public dataset, so no access information for a public dataset is provided.)
Dataset Splits	No	We generate the outcome model f by sampling the Fourier coefficients from the uniform distribution U(−1, 1)^K. (The paper uses simulated data and thus does not discuss predefined train/validation/test splits of a public dataset.)
Hardware Specification	Yes	Experiments were run on a device with a 16 core Intel Core Ultra 7 165H processor with 32 GB RAM, and an NVIDIA RTX 4000 Mobile Ada Generation 12 GB GPU.
Software Dependencies	Yes	The code is implemented in Python, utilizing the cupy and numba libraries, among others. The active design optimization was done using scipy SLSQP solver. (The reference Virtanen et al., 2020 specifies 'Scipy 1.0', indicating a version for the scipy library.)
Experiment Setup	Yes	We generate the outcome model f by sampling the Fourier coefficients from the uniform distribution U(−1, 1)^K. We noise the outcomes with standard Gaussian noise. In each of the following simulations, we keep these coefficients constant through all iterations of each run. The curves are generated with values p = 10, k = 2, n = 200; p = 20, k = 2, n = 1000; and p = 30, k = 2, n = 1000.
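
The simulated setup quoted in the Experiment Setup row can be illustrated with a minimal sketch. It is not the authors' code. It assumes that treatments are encoded as ±1, that K counts all treatment subsets of size at most k, and that each unit receives each treatment independently with a fixed probability `dosage`. The function names, the `dosage` parameter, and the seed are hypothetical and chosen only for illustration.

```python
import itertools
import numpy as np


def interaction_subsets(p, k):
    """All subsets of {0, ..., p-1} of size at most k, including the empty set."""
    return [s for d in range(k + 1) for s in itertools.combinations(range(p), d)]


def simulate_outcomes(p, k, n, dosage=0.5, seed=0):
    """Draw n noisy outcomes from a random outcome model of interaction order k.

    Assumptions (not taken from the paper): +/-1 treatment encoding and an
    independent per-treatment assignment probability `dosage`.
    """
    rng = np.random.default_rng(seed)
    subsets = interaction_subsets(p, k)
    # Fourier coefficients sampled once from U(-1, 1)^K and then held fixed.
    coeffs = rng.uniform(-1.0, 1.0, size=len(subsets))
    # Random treatment assignment: +1 if the unit receives the treatment, else -1.
    x = np.where(rng.random((n, p)) < dosage, 1.0, -1.0)
    # One feature per subset S: the product of x_i over i in S (empty product is 1).
    features = np.column_stack(
        [x[:, np.array(s, dtype=int)].prod(axis=1) for s in subsets]
    )
    # Outcomes are the model value plus standard Gaussian noise.
    y = features @ coeffs + rng.standard_normal(n)
    return x, features, coeffs, y


if __name__ == "__main__":
    x, features, coeffs, y = simulate_outcomes(p=10, k=2, n=200)
    print(features.shape, y.shape)
```

With p = 10 and k = 2 this encoding gives K = 1 + 10 + 45 = 56 coefficients, so the feature matrix for n = 200 units has shape (200, 56).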