What Can Learned Intrinsic Rewards Capture?
Authors: Zeyu Zheng, Junhyuk Oh, Matteo Hessel, Zhongwen Xu, Manuel Kroiss, Hado van Hasselt, David Silver, Satinder Singh
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present the results from our empirical investigations in two sections. We investigate these research questions in the grid-world domains illustrated in Figure 2. |
| Researcher Affiliation | Collaboration | ¹University of Michigan, ²DeepMind. Correspondence to: Zeyu Zheng <zeyu@umich.edu>, Junhyuk Oh <junhyuk@google.com>. |
| Pseudocode | Yes | Algorithm 1: Learning intrinsic rewards (see the sketch after this table). |
| Open Source Code | No | No explicit statement about providing open-source code or a link to a repository was found in the paper. |
| Open Datasets | No | We investigate these research questions in the grid-world domains illustrated in Figure 2. For each domain, we trained an intrinsic reward function across many lifetimes and evaluated it by training an agent using the learned reward. |
| Dataset Splits | No | No explicit mention of traditional training, validation, or test dataset splits (e.g., percentages or counts) was found, as the experiments involve interactive learning within simulated environments over lifetimes and episodes. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments were mentioned in the main paper. |
| Software Dependencies | No | No specific software dependencies with version numbers were explicitly mentioned in the main text of the paper. |
| Experiment Setup | No | The details of implementation and hyperparameters are described in the supplementary material. |
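The pseudocode the paper does provide (Algorithm 1) learns an intrinsic reward function by meta-gradient: an inner loop trains a fresh agent with the current intrinsic reward, and an outer loop updates the intrinsic reward parameters to maximize the agent's extrinsic lifetime return. Below is a minimal sketch of that idea, not the authors' code: it assumes a two-armed bandit in place of the paper's grid-world lifetimes, exact expectations in place of sampled trajectories, and a single inner policy-gradient step in place of the full inner loop. All names (`eta`, `theta`, `inner_lr`, `outer_lr`) are illustrative.

```python
# Hedged sketch of meta-gradient intrinsic-reward learning (Algorithm 1's
# structure), under the simplifying assumptions stated above.
import jax
import jax.numpy as jnp

EXTRINSIC = jnp.array([0.0, 1.0])   # true (extrinsic) reward per arm
inner_lr, outer_lr = 1.0, 0.5       # illustrative step sizes

def policy(theta):
    return jax.nn.softmax(theta)    # action distribution over the two arms

def inner_update(theta, eta):
    # One policy-gradient step on the *intrinsic* objective E_pi[r_eta(a)];
    # stands in for the inner-loop agent training of a lifetime.
    intrinsic_return = lambda th: jnp.dot(policy(th), eta)
    return theta + inner_lr * jax.grad(intrinsic_return)(theta)

def meta_objective(eta, theta):
    # Extrinsic return of the policy *after* training with the intrinsic
    # reward; differentiating through inner_update yields the meta-gradient.
    theta_new = inner_update(theta, eta)
    return jnp.dot(policy(theta_new), EXTRINSIC)

eta = jnp.zeros(2)                  # intrinsic reward parameters
for lifetime in range(200):
    theta = jnp.zeros(2)            # a new agent is born each lifetime
    eta = eta + outer_lr * jax.grad(meta_objective)(eta, theta)

print("learned intrinsic rewards:", eta)  # arm 1 ends up rewarded more
```

In this toy setting the learned `eta` comes to favor the arm with higher extrinsic reward, so a freshly initialized agent trained only on the intrinsic reward still ends up behaving well extrinsically, which is the mechanism the paper's grid-world experiments probe at scale.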