APS: Active Pretraining with Successor Features
Authors: Hao Liu, Pieter Abbeel
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | When evaluated on the Atari 100k data-efficiency benchmark, our approach significantly outperforms previous methods combining unsupervised pretraining with task-specific finetuning. |
| Researcher Affiliation | Academia | University of California, Berkeley, CA, USA. |
| Pseudocode | Yes | Algorithm 1: Training APS (a hedged sketch of the objective it optimizes appears after the table). |
| Open Source Code | No | No, the paper does not provide a link to open-source code or explicitly state that the code is available. |
| Open Datasets | Yes | We evaluate our approach on the Atari benchmark (Bellemare et al., 2013), where we apply APS to DrQ (Kostrikov et al., 2020) and test its performance after fine-tuning for 100K supervised environment steps. |
| Dataset Splits | No | No, the paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) for training, validation, or test sets. |
| Hardware Specification | No | No, the paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | No, the paper mentions frameworks and prior works it builds upon (e.g., DrQ, the Adam optimizer, the pycolab game engine) but does not provide specific version numbers for software dependencies. |
| Experiment Setup | Yes | We largely follow Hansen et al. (2020) for hyperparameters used in our Atari experiments, with the following three exceptions. [...] We use the Adam optimizer (Kingma & Ba, 2015) with a learning rate of 0.0001. We use a discount factor γ = 0.99 and a standard batch size of 32. ψ is coupled with a target network (Mnih et al., 2015) with an update period of 100 updates. (A configuration sketch follows the table.) |
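The paper's Algorithm 1 is only named in the table above and no code is released, so the following is a minimal sketch of the intrinsic reward that APS pretraining maximizes: an exploitation term φ(s)ᵀz from successor features (as in VISR) plus a particle-based entropy exploration term (as in APT). It assumes L2-normalized features and a task vector z on the unit sphere; the helper names (`sample_task`, `aps_reward`) and the choice of `k` are illustrative, not taken from the authors' implementation.

```python
# Sketch of the APS intrinsic reward, assuming a learned encoder that
# produces L2-normalized successor features phi(s). Not the authors' code.
import torch
import torch.nn.functional as F


def sample_task(dim: int) -> torch.Tensor:
    """Sample a task vector z uniformly from the unit sphere."""
    z = torch.randn(dim)
    return F.normalize(z, dim=-1)


def aps_reward(phi: torch.Tensor, z: torch.Tensor, k: int = 12) -> torch.Tensor:
    """Intrinsic reward r(s, z) = r_exploit + r_explore.

    phi: (B, D) L2-normalized successor features for a batch of states
         (requires B > k for the k-NN estimate below).
    z:   (D,)   task vector on the unit sphere.
    """
    # Exploitation: log q(z|s) up to a constant under a von Mises-Fisher
    # model, i.e. the inner product phi(s)^T z (VISR-style).
    r_exploit = phi @ z                              # (B,)

    # Exploration: particle-based entropy estimate over the batch
    # (APT-style), using distances to the k nearest neighbors.
    dists = torch.cdist(phi, phi)                    # (B, B)
    knn_dists, _ = dists.topk(k + 1, largest=False)  # nearest is self (0)
    r_explore = torch.log(1.0 + knn_dists[:, 1:].mean(dim=1))

    return r_exploit + r_explore


# Usage on a fake batch of features:
phi = F.normalize(torch.randn(256, 16), dim=-1)
z = sample_task(16)
r = aps_reward(phi, z)  # (256,) intrinsic rewards for the RL learner
```

During pretraining an off-policy agent maximizes this reward; at finetuning time the paper recovers z for the downstream task by regressing extrinsic rewards onto the learned features.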
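To make the quoted hyperparameters concrete, here is a minimal configuration sketch using only the values stated in the Experiment Setup row (Adam with learning rate 0.0001, γ = 0.99, batch size 32, ψ target-network update period of 100 updates). The class name, network shapes, and loop scaffolding are illustrative assumptions.

```python
# Sketch of the stated finetuning hyperparameters; only the numeric values
# are grounded in the paper excerpt above.
from dataclasses import dataclass

import torch


@dataclass
class APSAtariConfig:
    lr: float = 1e-4                 # Adam learning rate (Kingma & Ba, 2015)
    gamma: float = 0.99              # discount factor
    batch_size: int = 32             # standard Atari batch size
    target_update_period: int = 100  # updates between psi target syncs


cfg = APSAtariConfig()
psi = torch.nn.Linear(16, 16)        # stand-in for the successor-feature net
psi_target = torch.nn.Linear(16, 16)
psi_target.load_state_dict(psi.state_dict())
optimizer = torch.optim.Adam(psi.parameters(), lr=cfg.lr)

for step in range(1, 501):
    # ... one gradient update on psi would go here ...
    if step % cfg.target_update_period == 0:
        # Hard target-network sync, as in Mnih et al. (2015).
        psi_target.load_state_dict(psi.state_dict())
```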