Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Do Finetti: On Causal Effects for Exchangeable Data
Authors: Siyuan Guo, Chi Zhang, Karthika Mohan, Ferenc Huszar, Bernhard Schölkopf
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we develop an algorithm that performs simultaneous causal discovery and effect estimation given multi-environment data. [...] empirically validate our results in Section 5. [...] 5 Experiments: We construct synthetic datasets according to causal Pólya urn model (cf. Section 3.2) and demonstrate that Do-Finetti algorithm can estimate causal effects and graphs simultaneously. |
| Researcher Affiliation | Collaboration | 1Max Planck Institute for Intelligent Systems 2Toyota Research Institute 3 Oregon State University 4 University of Cambridge |
| Pseudocode | Yes | See Algorithm 1 in Appendix I for details of the procedure. |
| Open Source Code | Yes | We included detailed experimental setup descriptions in the appendix and also provided reproducible code in supplementary materials. |
| Open Datasets | No | We construct synthetic datasets according to causal Pólya urn model (cf. Section 3.2) The data-generating process for $X \to Y$, for example, as follows: $\theta^e \sim \mathrm{Beta}(\alpha, \beta)$, $\psi^e \sim \mathrm{Beta}(\alpha, \beta)$; $X \to Y$: $X_i^e := \mathrm{Ber}(\theta^e)$, $Y_i^e := \mathrm{Ber}(\psi^e) \oplus X_i^e$ |
| Dataset Splits | No | The paper describes generating synthetic data and running experiments over a varying number of environments but does not specify explicit training, validation, or test dataset splits, percentages, or methodology for partitioning the data. |
| Hardware Specification | No | The experiments can be reproduced using single laptop with CPUs with a time estimate within 5 minutes. |
| Software Dependencies | No | The paper states 'The code is building on top of Guo et al. [2023a] under license CC-BY 4.0.' but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, or other libraries). |
| Experiment Setup | Yes | The data-generating process for $X \to Y$, for example, as follows: $\theta^e \sim \mathrm{Beta}(\alpha, \beta)$, $\psi^e \sim \mathrm{Beta}(\alpha, \beta)$; $X \to Y$: $X_i^e := \mathrm{Ber}(\theta^e)$, $Y_i^e := \mathrm{Ber}(\psi^e) \oplus X_i^e$, where $\oplus$ denotes xor operation and $X_i^e$, $Y_i^e$ denotes variable generated at $i$-th position in environment $e$ and set $\alpha = 1$, $\beta = 3$. We repeat the experiment for 100 times and report the mean squared error loss between predicted and analytic solutions across varying number of environments. |
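
The data-generating process quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration and not the authors' code. It assumes the causal direction is X → Y and that the dropped operator in the quoted text is xor. The function name `sample_environments`, the `samples_per_env` parameter, and the `seed` argument are hypothetical; the source does not state how many samples are drawn per environment.

```python
import random


def sample_environments(num_envs, samples_per_env, alpha=1.0, beta=3.0, seed=0):
    """Draw binary (X, Y) pairs per environment for the X -> Y case.

    Hypothetical helper: for each environment e, theta_e and psi_e are
    drawn from Beta(alpha, beta); then X_i = Ber(theta_e) and
    Y_i = Ber(psi_e) xor X_i. Defaults follow the quoted alpha = 1, beta = 3.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(num_envs):
        # Environment-level parameters, shared by all samples in this environment.
        theta = rng.betavariate(alpha, beta)
        psi = rng.betavariate(alpha, beta)
        # X_i ~ Bernoulli(theta)
        xs = [int(rng.random() < theta) for _ in range(samples_per_env)]
        # Y_i = Bernoulli(psi) xor X_i
        ys = [int(rng.random() < psi) ^ x for x in xs]
        data.append((xs, ys))
    return data


if __name__ == "__main__":
    # Two samples per environment is an arbitrary choice for illustration.
    for xs, ys in sample_environments(num_envs=3, samples_per_env=2):
        print(xs, ys)
```

Fixing the seed makes repeated runs comparable. A loop over 100 seeds would mirror the quoted repetition count, though the estimator and the analytic solution it is compared against are not reproduced here.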