Distral: Robust multitask reinforcement learning

Authors: Yee Teh, Victor Bapst, Wojciech M. Czarnecki, John Quan, James Kirkpatrick, Raia Hadsell, Nicolas Heess, Razvan Pascanu

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show that our approach supports efficient transfer on complex 3D environments, outperforming several related methods.
Researcher Affiliation | Industry | DeepMind, London, UK
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks with clear labels.
Open Source Code | No | The paper does not provide concrete access to source code, such as a specific repository link or an explicit code release statement.
Open Datasets | No | The paper uses custom environments (grid world, 3D mazes, navigation, laser-tag) but does not provide access information (link, DOI, citation with authors/year) for these or any other public datasets.
Dataset Splits | No | The paper discusses learning curves and performance but does not specify exact percentages or sample counts for training, validation, or test dataset splits.
Hardware Specification | No | The paper mentions a "distributed Python/TensorFlow code base" but does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for experiments.
Software Dependencies | No | The paper mentions Python/TensorFlow but does not specify version numbers for these or any other ancillary software components.
Experiment Setup | Yes | We tried three values for the entropy costs β and three learning rates. Four runs for each hyperparameter setting were used. All other hyperparameters were fixed to the single-task A3C defaults and, for the KL+ent 1col and KL+ent 2col algorithms, was fixed at 0.5.
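The reported sweep (three entropy costs × three learning rates × four runs per setting) can be sketched as a simple grid enumeration. The numeric values below are placeholders for illustration only; the paper does not list the exact entropy costs or learning rates used.

```python
from itertools import product

# Placeholder sweep values -- the paper states three entropy costs (beta),
# three learning rates, and four runs per setting, but does not give the
# exact numbers, so these are illustrative assumptions.
entropy_costs = [1e-4, 1e-3, 1e-2]   # assumed betas
learning_rates = [1e-4, 2e-4, 4e-4]  # assumed learning rates
seeds = range(4)                     # four runs per hyperparameter setting

sweep = list(product(entropy_costs, learning_rates, seeds))
print(len(sweep))  # 3 entropy costs * 3 learning rates * 4 seeds = 36 runs
```

Each of the 36 tuples would correspond to one training run; all remaining hyperparameters stay at the single-task A3C defaults, as stated above.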