Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Probabilistic Factorial Experimental Design for Combinatorial Interventions
Authors: Divya Shyamal, Jiaqi Zhang, Caroline Uhler
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments to validate our theoretical results, as well as show a comparison to fractional factorial design, using simulated data. |
| Researcher Affiliation | Academia | (1) Department of Electrical Engineering and Computer Science, MIT; (2) Department of Mathematics, MIT; (3) Eric and Wendy Schmidt Center, Broad Institute. Correspondence to: Jiaqi Zhang <EMAIL>, Caroline Uhler <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Active probabilistic factorial experimental design. |
| Open Source Code | Yes | Code can be found at the linked repository. |
| Open Datasets | No | We generate the outcome model f by sampling the Fourier coefficients from the uniform distribution, i.e., $\beta \sim U(-1, 1)^K$. (The paper explicitly states the data is simulated, not from a public dataset. Therefore, no concrete access information for a public dataset is provided.) |
| Dataset Splits | No | We generate the outcome model f by sampling the Fourier coefficients from the uniform distribution, i.e., $\beta \sim U(-1, 1)^K$. (The paper uses simulated data, and thus does not discuss predefined train/test/validation splits of a public dataset.) |
| Hardware Specification | Yes | Experiments were run on a device with a 16 core Intel Core Ultra 7 165H processor with 32 GB RAM, and an NVIDIA RTX 4000 Mobile Ada Generation 12 GB GPU. |
| Software Dependencies | Yes | The code is implemented in Python, utilizing the cupy and numba libraries, among others. The active design optimization was done using scipy SLSQP solver. (The reference Virtanen et al., 2020 specifies 'Scipy 1.0', indicating a version for the scipy library.) |
| Experiment Setup | Yes | We generate the outcome model f by sampling the Fourier coefficients from the uniform distribution, i.e., $\beta \sim U(-1, 1)^K$. We noise the outcomes with standard Gaussian noise. In each of the following simulations, we keep $\beta$ constant through all iterations of each run. The curves are generated with values p = 10, k = 2, n = 200; p = 20, k = 2, n = 1000; and p = 30, k = 2, n = 1000. |
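
The simulation described in the Experiment Setup row can be sketched roughly as follows. This is an illustrative reading of the quoted text, not the authors' code. Several things are assumptions made only for this example: the function names, the ±1 encoding of treatments, the parity (product) basis over subsets of at most k treatments, and a uniform per-treatment inclusion probability `dosage`. What is taken from the quoted setup is the uniform sampling of the coefficients on (-1, 1), the standard Gaussian noise, and the values p = 10, k = 2, n = 200.

```python
from itertools import combinations

import numpy as np


def interaction_subsets(p, k):
    """All subsets of {0, ..., p-1} of size at most k, including the empty set."""
    return [s for size in range(k + 1) for s in combinations(range(p), size)]


def fourier_features(x, subsets):
    """Column j is the product of x[:, i] over i in subsets[j] (assumed parity basis)."""
    # The product over an empty subset is 1, which gives the intercept column.
    return np.column_stack([np.prod(x[:, list(s)], axis=1) for s in subsets])


def simulate(p=10, k=2, n=200, dosage=0.5, seed=0):
    """Draw one simulated data set under the assumptions stated in the lead-in."""
    rng = np.random.default_rng(seed)
    subsets = interaction_subsets(p, k)

    # Fourier coefficients sampled uniformly on (-1, 1), one per subset.
    beta = rng.uniform(-1.0, 1.0, size=len(subsets))

    # Assumption: each treatment is included independently with probability
    # `dosage`, encoded as +1 (included) or -1 (not included).
    x = np.where(rng.random((n, p)) < dosage, 1.0, -1.0)

    # Outcomes are the model value plus standard Gaussian noise.
    y = fourier_features(x, subsets) @ beta + rng.standard_normal(n)
    return x, y, beta, subsets


x, y, beta, subsets = simulate(p=10, k=2, n=200)
```

With p = 10 and k = 2 the coefficient vector has K = 1 + 10 + 45 = 56 entries under this basis. Holding `beta` fixed while redrawing `x` and `y` mirrors the statement that the coefficients are kept constant through all iterations of a run.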