Amortized Active Causal Induction with Deep Reinforcement Learning
Authors: Yashas Annadani, Panagiotis Tigas, Stefan Bauer, Adam Foster
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On synthetic data and a single-cell gene expression simulator, we demonstrate empirically that the data acquired through our policy results in a better estimate of the underlying causal graph than alternative strategies. |
| Researcher Affiliation | Academia | Yashas Annadani (1,2), Panagiotis Tigas (3), Stefan Bauer (1,2), Adam Foster. Affiliations: (1) Helmholtz AI, Munich; (2) Technical University of Munich; (3) OATML, University of Oxford |
| Pseudocode | No | The paper does not contain explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions making use of publicly available repositories and code for the single-cell gene simulator, the AVICI model, and baselines (Diff CBED), but does not explicitly state that *their* code for the CAASL methodology is open source or provide a link within the paper content itself. |
| Open Datasets | Yes | On synthetic data and the single-cell gene expression simulator SERGIO [17], we empirically study various aspects of our trained policy... The simulator of Dibaeinia and Sinha [17] is publicly available under GPL-3.0 license. |
| Dataset Splits | No | SAC related hyperparameters are tuned based on performance on held-out design environments. This implies a validation set, but specific split percentages or sample counts for these 'held-out design environments' are not provided. |
| Hardware Specification | Yes | We train all models on 3 40GB NVIDIA A100 GPU accelerators. |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer [33]' but does not provide specific version numbers for any software dependencies like programming languages, libraries, or frameworks (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | Training Details. We train CAASL with 4 layers of alternating attention for the transformer, followed by a max pooling operation over the history, to give an embedding with size l = 32. SAC related hyperparameters are tuned based on performance on held-out design environments. Details of the architecture, hyperparameter tuning and optimizer are given in Appendix D. Table 2 (Hyperparameters used for training in CAASL): No. attention layers (policy, Q-Function): 4; No. attention heads (policy, Q-Function): 8; Dropout (policy): 0.1; Hidden sizes (policy and Q): (128, 128); Policy LR: {0.01, 0.001}; Q-Function LR: {3e-5, 3e-6}. A hedged configuration sketch based on these values follows the table. |
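
For a compact view of the training configuration reported in the Experiment Setup row, the sketch below gathers the quoted values (attention layers and heads, dropout, embedding size, hidden sizes, learning-rate grids) into a single Python dataclass. This is a minimal illustration only; the class and field names are assumptions introduced here and are not taken from the authors' code, which the paper does not link.

```python
# Illustrative sketch: collecting the hyperparameters quoted from the paper's
# Training Details and Table 2 into one configuration object.
# All names (CAASLConfig and its fields) are hypothetical, chosen for clarity.
from dataclasses import dataclass
from typing import Tuple


@dataclass
class CAASLConfig:
    # Transformer history encoder: alternating attention, max-pooled over the
    # history to produce an embedding of size l = 32.
    num_attention_layers: int = 4            # for both policy and Q-function
    num_attention_heads: int = 8             # for both policy and Q-function
    policy_dropout: float = 0.1
    embedding_size: int = 32                 # pooled history embedding size l
    hidden_sizes: Tuple[int, int] = (128, 128)  # MLP heads for policy and Q

    # SAC learning rates; the paper reports these as small grids tuned on
    # held-out design environments (exact split sizes are not given).
    policy_lr_grid: Tuple[float, ...] = (0.01, 0.001)
    q_function_lr_grid: Tuple[float, ...] = (3e-5, 3e-6)


if __name__ == "__main__":
    # Instantiate the reported defaults and print them for inspection.
    print(CAASLConfig())
```

The dataclass mirrors the paper's reported values one-to-one; anything not stated in the paper (e.g., batch size, optimizer schedule beyond the Adam reference) is deliberately omitted rather than guessed.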