Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Granger causal inference on DAGs identifies genomic loci regulating transcription
Authors: Alexander P Wu, Rohit Singh, Bonnie Berger
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We applied GrID-Net on multimodal single-cell assays that profile chromatin accessibility (ATAC-seq) and gene expression (RNA-seq) in the same cell and show that it dramatically outperforms existing methods for inferring regulatory locus-gene links, achieving up to 71% greater agreement with independent population genetics-based estimates. |
| Researcher Affiliation | Academia | 1 Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA 02139, USA 2 Department of Mathematics, MIT, Cambridge, MA 02139, USA EMAIL |
| Pseudocode | No | The paper describes the model architecture using mathematical equations (Eqn. 4, 5, 6, 7) and textual descriptions but does not provide a formal pseudocode block or algorithm listing. |
| Open Source Code | Yes | The code for GrID-Net is available at https://github.com/alexw16/gridnet. |
| Open Datasets | Yes | We analyzed three single-cell multimodal datasets that characterize a range of dynamic processes, including cancer drug response and cellular differentiation (Cao et al., 2018; Chen et al., 2019; Ma et al., 2020). |
| Dataset Splits | No | The paper describes training details like learning rate, epochs, and minibatch size for the GrID-Net models. It also mentions comparing full and reduced models using an F-test. However, it does not explicitly define or specify standard train/validation/test splits of the datasets (e.g., percentages or counts for each subset) that are typically used to evaluate model generalization. |
| Hardware Specification | Yes | All models were implemented in PyTorch and trained on a single NVIDIA Tesla V100 GPU. |
| Software Dependencies | No | The paper states, “All models were implemented in PyTorch” and mentions using “Homer” and “statsmodels package,” but it does not specify version numbers for any of these software dependencies. |
| Experiment Setup | Yes | GrID-Net models were trained using the Adam optimizer with a learning rate of 0.001 for 20 epochs or until convergence (defined to be the point at which the relative change in the loss function is less than 0.1/\|P\| across consecutive epochs). A minibatch size of 1024 candidate peak-gene pairs was used during training, and trainable parameters in the model were initialized using Glorot initialization (Bengio & Glorot, 2010). All GrID-Net models consisted of L = 10 GNN layers; the architectures of the three sub-models ($h_y^{(\mathrm{reduced})}$, $h_y^{(\mathrm{full})}$, and $h_x^{(\mathrm{full})}$) were identical but separate. |
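
The stopping rule quoted in the Experiment Setup row (at most 20 epochs, or until the relative change in the loss falls below 0.1/|P|) can be illustrated with a minimal, framework-free sketch. This is not the authors' code: `train_until_converged`, `run_epoch`, and `num_pairs` are hypothetical names, |P| is assumed to be the number of candidate peak-gene pairs, and "relative change" is assumed to mean the absolute difference between consecutive epoch losses divided by the earlier loss.

```python
from typing import Callable, List


def train_until_converged(
    run_epoch: Callable[[int], float],
    num_pairs: int,
    max_epochs: int = 20,
) -> List[float]:
    """Run up to max_epochs epochs, stopping early on convergence.

    run_epoch(epoch) is a hypothetical callback that trains for one epoch
    and returns that epoch's loss. num_pairs stands in for |P| (assumed to
    be the number of candidate peak-gene pairs).
    """
    if num_pairs <= 0:
        raise ValueError("num_pairs must be positive")

    tolerance = 0.1 / num_pairs  # threshold quoted in the table: 0.1/|P|
    losses: List[float] = []

    for epoch in range(max_epochs):
        loss = run_epoch(epoch)
        losses.append(loss)
        if len(losses) >= 2:
            previous = losses[-2]
            # Assumed definition of relative change; the small floor avoids
            # division by zero if the previous loss is exactly 0.
            relative_change = abs(loss - previous) / max(abs(previous), 1e-12)
            if relative_change < tolerance:
                break

    return losses


# Toy usage: a flat loss converges after two epochs, while a loss that
# halves every epoch never meets the threshold and runs all 20 epochs.
flat = train_until_converged(lambda epoch: 1.0, num_pairs=1000)
halving = train_until_converged(lambda epoch: 0.5 ** epoch, num_pairs=1)
print(len(flat), len(halving))
```

In an actual PyTorch run, `run_epoch` would wrap one pass over minibatches of 1024 candidate pairs with an Adam optimizer at learning rate 0.001; those details are omitted here to keep the sketch dependency-free.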