Amortized Inference for Causal Structure Learning
Authors: Lars Lorch, Scott Sussex, Jonas Rothfuss, Andreas Krause, Bernhard Schölkopf
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On synthetic data and semisynthetic gene expression data, our models exhibit robust generalization capabilities when subject to substantial distribution shifts and significantly outperform existing algorithms, especially in the challenging genomics domain. Our code and models are publicly available at: https://github.com/larslorch/avici. |
| Researcher Affiliation | Academia | Lars Lorch, ETH Zurich, Zurich, Switzerland, llorch@ethz.ch; Scott Sussex, ETH Zurich, Zurich, Switzerland, ssussex@ethz.ch; Jonas Rothfuss, ETH Zurich, Zurich, Switzerland, rojonas@ethz.ch; Andreas Krause, ETH Zurich, Zurich, Switzerland, krausea@ethz.ch; Bernhard Schölkopf, MPI for Intelligent Systems, Tübingen, Germany, bs@tuebingen.mpg.de |
| Pseudocode | Yes | Algorithm 1: Training the inference model fϕ (a sketch of such a training loop follows the table) |
| Open Source Code | Yes | Our code and models are publicly available at: https://github.com/larslorch/avici. |
| Open Datasets | Yes | In Appendix E, we additionally report results on a real-world proteomics dataset (Sachs et al., 2005). In addition to SCMs, we consider the challenging domain of gene regulatory networks (GRNs) using the simulator of Dibaeinia and Sinha (2020). In the GRN domain, we use subgraphs of the known S. cerevisiae and E. coli GRNs and their effect signs whenever known. To extract these subgraphs, we use the procedure by Marbach et al. (2009). |
| Dataset Splits | No | The paper mentions training data and unseen test data, but does not specify a separate validation split with percentages or counts. |
| Hardware Specification | Yes | All experiments ran on a private cluster with Intel Xeon E5-2630 v4 CPUs and NVIDIA Tesla V100 GPUs. |
| Software Dependencies | No | The paper mentions software like JAX and Haiku, but does not provide specific version numbers for these or other libraries used. |
| Experiment Setup | Yes | We train all models for 300k steps with a batch size of 20. We use the Adam optimizer (Kingma and Ba, 2015) with an initial learning rate of 10⁻⁴ that is linearly warmed up for 10k steps and then decayed proportionally to the inverse square root of the step count. We set τ to 1.0 during training and decrease it to 0.1 during evaluation and when plotting calibration curves. The dual variable λ is initialized to 1.0 and updated every 10 steps with a step size η = 0.01. We use a hidden dimension of k = 128 for the embeddings and the feed-forward networks, and 8 attention heads in each layer. Dropout (Srivastava et al., 2014) is applied before each residual connection with a rate of 0.1 for LINEAR and RFF and 0.2 for GRN. |
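The "Experiment Setup" row above reports a peak learning rate of 10⁻⁴ with a 10k-step linear warmup followed by inverse-square-root decay. Below is a minimal sketch of such a schedule paired with Adam via optax; the function name `warmup_inverse_sqrt` is ours and the paper does not provide this code, so treat it as an illustration of the reported schedule rather than the authors' implementation.

```python
import jax.numpy as jnp
import optax

def warmup_inverse_sqrt(step, base_lr=1e-4, warmup_steps=10_000):
    """Linear warmup to base_lr over warmup_steps, then base_lr * sqrt(warmup_steps / step)."""
    step = jnp.maximum(step, 1)
    warm = base_lr * step / warmup_steps                 # linear warmup phase
    decay = base_lr * jnp.sqrt(warmup_steps / step)      # inverse-square-root decay phase
    return jnp.where(step < warmup_steps, warm, decay)

# optax.adam accepts a schedule function mapping the step count to a learning rate
optimizer = optax.adam(learning_rate=warmup_inverse_sqrt)
```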
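The "Pseudocode" row points to Algorithm 1, which trains the inference model fϕ on synthetically generated tasks under an acyclicity constraint. The loop below is a loose, self-contained sketch of that kind of constrained training step, not the authors' implementation: `sample_task` and `toy_model` are hypothetical stand-ins for the paper's data simulator and transformer-based model, and the NOTEARS-style trace-exponential acyclicity measure is an assumption on our part. Only the hyperparameters quoted above (Adam, λ initialized to 1.0, dual updates every 10 steps with η = 0.01) come from the paper.

```python
import jax
import jax.numpy as jnp
import optax

d, n = 5, 100  # toy setting: 5 variables, 100 observations per sampled dataset

def sample_task(key):
    """Hypothetical simulator stand-in: random DAG + linear-Gaussian data."""
    k1, k2, k3 = jax.random.split(key, 3)
    graph = jnp.triu(jax.random.bernoulli(k1, 0.3, (d, d)), k=1).astype(jnp.float32)
    weights = graph * jax.random.normal(k2, (d, d))
    noise = jax.random.normal(k3, (n, d))
    data = noise @ jnp.linalg.inv(jnp.eye(d) - weights)   # x = W^T x + eps, solved for x
    return graph, data

def toy_model(params, data):
    """Hypothetical model stand-in: maps dataset statistics to (d, d) edge logits."""
    feats = jnp.concatenate([data.mean(0), data.std(0)])  # (2d,)
    return (params["w"] @ feats).reshape(d, d) + params["b"]

def acyclicity(probs):
    # NOTEARS-style measure (assumed here): tr(exp(P)) - d vanishes iff P encodes no cycles
    return jnp.trace(jax.scipy.linalg.expm(probs)) - d

def loss_fn(params, lam, graph, data):
    logits = toy_model(params, data)
    # Bernoulli cross-entropy between predicted edges and the sampled ground-truth graph
    ce = optax.sigmoid_binary_cross_entropy(logits, graph).mean()
    h = acyclicity(jax.nn.sigmoid(logits))                # soft acyclicity penalty
    return ce + lam * h, h

params = {"w": jnp.zeros((d * d, 2 * d)), "b": jnp.zeros((d, d))}
opt = optax.adam(1e-4)
opt_state = opt.init(params)
lam, eta = 1.0, 0.01  # dual variable and step size as reported in the paper

key = jax.random.PRNGKey(0)
for step in range(100):  # the paper trains for 300k steps; 100 keeps the sketch cheap
    key, sub = jax.random.split(key)
    graph, data = sample_task(sub)
    (loss, h), grads = jax.value_and_grad(loss_fn, has_aux=True)(params, lam, graph, data)
    updates, opt_state = opt.update(grads, opt_state)
    params = optax.apply_updates(params, updates)
    if step % 10 == 0:
        lam = lam + eta * h  # dual ascent on the constraint (one plausible reading of the update)
```

The batch size of 20 from the table would correspond to averaging `loss_fn` over 20 sampled tasks per step (e.g., via `jax.vmap`); the sketch uses a single task to stay short.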