Amortized Active Causal Induction with Deep Reinforcement Learning

Authors: Yashas Annadani, Panagiotis Tigas, Stefan Bauer, Adam Foster

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "On synthetic data and a single-cell gene expression simulator, we demonstrate empirically that the data acquired through our policy results in a better estimate of the underlying causal graph than alternative strategies."
Researcher Affiliation | Academia | Yashas Annadani (1,2), Panagiotis Tigas (3), Stefan Bauer (1,2), Adam Foster. (1) Helmholtz AI, Munich; (2) Technical University of Munich; (3) OATML, University of Oxford.
Pseudocode | No | The paper does not contain explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper mentions making use of publicly available repositories and code for the single-cell gene simulator, the AVICI model, and baselines (Diff CBED), but it does not state that the authors' own code for the CAASL method is open source, nor does it provide a link within the paper itself.
Open Datasets | Yes | "On synthetic data and the single-cell gene expression simulator SERGIO [17], we empirically study various aspects of our trained policy..." and "The simulator of Dibaeinia and Sinha [17] is publicly available under GPL-3.0 license."
Dataset Splits | No | "SAC related hyperparameters are tuned based on performance on held-out design environments." This implies a validation set, but specific split percentages or sample counts for these held-out design environments are not provided.
Hardware Specification | Yes | "We train all models on 3 40GB NVIDIA A100 GPU accelerators."
Software Dependencies | No | The paper mentions using the "Adam optimizer [33]" but does not provide specific version numbers for any software dependencies such as programming languages, libraries, or frameworks (e.g., Python, PyTorch, CUDA).
Experiment Setup | Yes | "Training Details. We train CAASL with 4 layers of alternating attention for the transformer, followed by a max pooling operation over the history, to give an embedding of size l = 32. SAC related hyperparameters are tuned based on performance on held-out design environments. Details of the architecture, hyperparameter tuning and optimizer are given in Appendix D." Table 2 (hyperparameters used for training CAASL):
- No. attention layers (policy, Q-function): 4
- No. attention heads (policy, Q-function): 8
- Dropout (policy): 0.1
- Hidden sizes (policy and Q): (128, 128)
- Policy LR: {0.01, 0.001}
- Q-function LR: {3e-5, 3e-6}
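The quoted training details are concrete enough to sketch the history encoder. Below is a minimal, hypothetical PyTorch sketch of a transformer with 4 alternating attention layers, max pooling over the history, and embedding size l = 32, using the Table 2 settings (8 heads, dropout 0.1). The paper releases neither pseudocode nor code here, so the class name, tensor shapes, feedforward width, and the exact alternation pattern (attention over the history axis vs. the variable axis, in the style of AVICI-like models) are assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the CAASL history encoder described in the
# Experiment Setup row: 4 layers of alternating attention followed by
# max pooling over the history, embedding size l = 32. PyTorch and all
# shapes/names here are assumptions; the paper does not specify them.
import torch
import torch.nn as nn


class AlternatingAttentionEncoder(nn.Module):
    """Attention alternates between the history (acquisition) axis and the
    variable axis; a final max pool over the history yields a fixed-size
    embedding per variable."""

    def __init__(self, d_model: int = 32, n_layers: int = 4,
                 n_heads: int = 8, dropout: float = 0.1):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(
                # dim_feedforward=128 is an assumption; Table 2's
                # (128, 128) hidden sizes refer to the policy/Q heads.
                d_model=d_model, nhead=n_heads, dim_feedforward=128,
                dropout=dropout, batch_first=True)
            for _ in range(n_layers)
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, T, d, d_model) -- one token per (history step, variable).
        B, T, d, e = h.shape
        for i, layer in enumerate(self.layers):
            if i % 2 == 0:
                # Even layers: attend across the T history steps,
                # independently for each of the d variables.
                h = layer(h.permute(0, 2, 1, 3).reshape(B * d, T, e))
                h = h.reshape(B, d, T, e).permute(0, 2, 1, 3)
            else:
                # Odd layers: attend across the d variables,
                # independently for each history step.
                h = layer(h.reshape(B * T, d, e)).reshape(B, T, d, e)
        return h.amax(dim=1)  # max pool over history -> (B, d, d_model)


if __name__ == "__main__":
    enc = AlternatingAttentionEncoder()
    history = torch.randn(2, 10, 5, 32)  # 2 envs, 10 acquisitions, 5 variables
    print(enc(history).shape)            # torch.Size([2, 5, 32])
```

Under this reading, SAC's policy and Q-function heads would consume the pooled embedding, and the (128, 128) hidden sizes in Table 2 would describe those MLP heads rather than the transformer itself.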