Continual Auxiliary Task Learning
Authors: Matthew McLeod, Chunlok Lo, Matthew Schlegel, Andrew Jacobsen, Raksha Kumaraswamy, Martha White, Adam White
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct an in-depth study into the resulting multi-prediction learning system. We conduct the experiment in a TMaze environment... |
| Researcher Affiliation | Academia | Department of Computing Science, University of Alberta ({mmcleod2,chunlok,mkschleg,ajjacobs,kumarasw}@ualberta.ca); Martha White and Adam White: Department of Computing Science, University of Alberta; CIFAR Canada AI Chair; Alberta Machine Intelligence Institute (Amii) |
| Pseudocode | Yes | Algorithm 1 Multi-Prediction Learning System |
| Open Source Code | No | The paper does not provide any explicit statement or link regarding the availability of open-source code for the described methodology. |
| Open Datasets | No | The paper describes experiments in the TMaze and Mountain Car environments; these are simulated reinforcement-learning environments rather than datasets with explicit access information or citations. |
| Dataset Splits | No | The paper describes experiments in reinforcement-learning environments but does not specify train/validation/test splits, since the agents learn from online interaction with an environment rather than from pre-defined datasets. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions algorithms such as Tree-Backup and Expected Sarsa, and a stepsize method called Auto, but does not provide version numbers for any software dependencies or programming languages. |
| Experiment Setup | Yes | Both use λ = 0.9 and a stepsize method called Auto [Mahmood et al., 2012] designed for online learning. We sweep the initial stepsize and meta stepsizes for Auto. For further details about the agents and optimizer, see Appendix D. |