Passive learning of active causal strategies in agents and language models
Authors: Andrew Lampinen, Stephanie Chan, Ishita Dasgupta, Andrew Nam, Jane Wang
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper is experimental: it shows empirically that agents trained via imitation on expert data can generalize at test time to infer and use causal links never present in the training data, and can generalize experimentation strategies to novel variable sets never observed in training. It further shows that strategies for causal intervention and exploitation can be generalized from passive data even in a more complex environment with high-dimensional observations, with the support of natural-language explanations; explanations even allow passive learners to generalize out-of-distribution from otherwise perfectly confounded training data. Finally, it shows that language models, trained only on passive next-word prediction, can generalize causal intervention strategies from a few-shot prompt containing examples of experimentation together with explanations and reasoning. |
| Researcher Affiliation | Collaboration | Andrew K. Lampinen (Google DeepMind, London, UK; lampinen@deepmind.com); Stephanie C. Y. Chan (Google DeepMind, London, UK; scychan@deepmind.com); Ishita Dasgupta (Google DeepMind, London, UK; idg@deepmind.com); Andrew J. Nam (Stanford University, Stanford, CA; ajhnam@stanford.edu); Jane X. Wang (Google DeepMind, London, UK; wangjane@deepmind.com) |
| Pseudocode | No | No, the paper does not contain structured pseudocode or algorithm blocks. Procedures are described in natural language within the text. |
| Open Source Code | No | No, the paper does not provide concrete access to its own source code for the methodology described. It mentions using existing open-source libraries like JAX and Haiku, but does not state that the code for their specific implementation or models is being released. |
| Open Datasets | No | No, the paper does not provide concrete access information for a publicly available or open dataset used for training. It states that 'The training data are generated from a subset of the possible causal DAGs' and refers to 'expert data' and 'odd-one-out environments' that appear to have been generated specifically for this work, without providing public access details. |
| Dataset Splits | No | No, the paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, and testing. It mentions training and evaluation (test) sets, but lacks the granular details required for full reproducibility of the data partitioning. |
| Hardware Specification | No | No, the paper does not provide specific hardware details (exact GPU/CPU models, processor types, memory amounts, or other machine specifications) used to run its experiments. It mentions using the '70 billion parameter Chinchilla LM' but not the hardware on which the experiments were performed. |
| Software Dependencies | No | No, the paper does not provide ancillary software details with version numbers. While it mentions tools and models such as 'Transformer-XL', 'Chinchilla LM', 'JAX', and 'Haiku' with citations, it does not specify the exact versions of the software used in the experiments. |
| Experiment Setup | No | No, the paper does not contain comprehensive experimental setup details such as concrete hyperparameter values (e.g., learning rate, batch size, number of epochs) for agent training. While it mentions some LM inference settings (e.g., nucleus sampling with p=0.8, T=1), these do not cover the general training setup. |
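For context on the inference settings quoted in the last row (nucleus sampling with p=0.8, T=1), the sketch below is a minimal illustration of nucleus (top-p) sampling. It is not code from the paper; the function name `nucleus_sample` and the NumPy-based implementation are assumptions made here purely for illustration.

```python
import numpy as np

def nucleus_sample(logits, p=0.8, temperature=1.0, rng=None):
    """Sample one token id with nucleus (top-p) sampling (illustrative only)."""
    rng = rng if rng is not None else np.random.default_rng()
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]              # most probable tokens first
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, p) + 1  # smallest set with mass >= p
    nucleus = order[:cutoff]
    nucleus_probs = probs[nucleus] / probs[nucleus].sum()
    return int(rng.choice(nucleus, p=nucleus_probs))

# Example: sample from a toy 5-token vocabulary with the settings quoted above.
print(nucleus_sample(np.array([2.0, 1.0, 0.5, -1.0, -3.0]), p=0.8, temperature=1.0))
```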