Attentive Experience Replay

Authors: Peiquan Sun, Wengang Zhou, Houqiang Li

AAAI 2020, pp. 5900-5907

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | "We couple AER with different off-policy algorithms and demonstrate that AER makes consistent improvements on the suite of OpenAI Gym tasks."
Researcher Affiliation | Academia | Peiquan Sun, Wengang Zhou, Houqiang Li. CAS Key Laboratory of Technology in GIPAS, EEIS Department, University of Science and Technology of China. spq@mail.ustc.edu.cn, {zhwg, lihq}@ustc.edu.cn
Pseudocode | Yes | "Algorithm 1: Attentive experience replay" (a hedged sketch of the sampling rule appears after this table)
Open Source Code | No | The paper does not provide concrete access to source code for the described methodology, nor does it mention a repository link or state that code is available in supplementary materials.
Open Datasets | Yes | "We compare AER with two ER methods, the uniform sampling and the prioritized experience replay (PER) (Schaul et al. 2016), on the suite of OpenAI Gym tasks (Figure 1) (Brockman et al. 2016)."
Dataset Splits | No | The paper describes an evaluation procedure ("evaluated every 5000 steps, by running the policy deterministically and cumulative rewards are averaged over 50 evaluation episodes"; see the sketch after this table) but, as is common for reinforcement learning environments, it does not specify static training/validation/test splits in the conventional supervised learning sense.
Hardware Specification | Yes | "All experiments are performed on a server with 40 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.4GHz processors and 8 GeForce GTX 1080 Ti 12 GB GPUs."
Software Dependencies | No | The paper mentions software components such as the Adam optimizer but does not provide version numbers for software dependencies such as libraries or frameworks.
Experiment Setup | Yes | "For deep actor-critic algorithms (SAC, TD3 and DDPG), both the policy network and the value network are represented by MLPs with two hidden layers (256, 256) and optimized using Adam (Kingma and Ba 2014) with a learning rate of 3 × 10^-4. The replay buffer size is 10^6 and the sampling mini-batch size is 256." (a configuration sketch appears after this table)
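To make the Pseudocode row concrete: as described in the paper, Algorithm 1 draws an oversized uniform candidate batch, scores each candidate by the similarity between its stored state and the agent's current state, and trains on the top-k most similar transitions. The minimal Python/NumPy sketch below follows that recipe; the function name, the oversampling factor, and the use of negative Euclidean distance as the similarity measure are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def aer_sample(buffer_states, current_state, batch_size, oversample=2):
    """Minimal sketch of attentive sampling (in the spirit of Algorithm 1).

    Draws an oversized uniform candidate set, then keeps the `batch_size`
    transitions whose stored states are most similar to `current_state`.
    The oversampling factor and the similarity measure (negative Euclidean
    distance) are assumptions made for illustration.
    """
    # 1) Uniform candidate set, larger than the final mini-batch.
    n_candidates = oversample * batch_size
    cand = np.random.randint(0, len(buffer_states), size=n_candidates)

    # 2) Similarity of each candidate state to the agent's current state.
    sims = -np.linalg.norm(buffer_states[cand] - current_state, axis=1)

    # 3) Buffer indices of the top-k most similar transitions.
    return cand[np.argsort(sims)[-batch_size:]]
```

Note that, unlike PER, this selection rule maintains no persistent per-transition priorities; scoring happens only over the candidate batch at sampling time.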
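The evaluation protocol quoted in the Dataset Splits row can likewise be pinned down with a short loop: every 5000 training steps, run the current policy deterministically for 50 episodes and average the returns. The sketch below uses the classic OpenAI Gym API of the paper's era; the environment name and the `policy.act` interface are hypothetical.

```python
import gym
import numpy as np

EVAL_EVERY = 5000    # evaluate every 5000 environment steps (from the paper)
EVAL_EPISODES = 50   # returns averaged over 50 evaluation episodes (from the paper)

def evaluate(policy, env_name="HalfCheetah-v2"):
    """Average deterministic episodic return; `policy.act` is a hypothetical interface."""
    env = gym.make(env_name)
    returns = []
    for _ in range(EVAL_EPISODES):
        state, done, total = env.reset(), False, 0.0
        while not done:
            action = policy.act(state, deterministic=True)
            state, reward, done, _ = env.step(action)
            total += reward
        returns.append(total)
    env.close()
    return float(np.mean(returns))
```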
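Finally, the Experiment Setup row fixes the network and optimizer hyperparameters. A minimal PyTorch rendering follows; PyTorch itself, the ReLU activations, and the input/output dimensions are assumptions (the paper names neither the framework nor the nonlinearity), while the layer widths, learning rate, buffer size, and batch size are the quoted values.

```python
import torch
import torch.nn as nn

# Hyperparameters quoted in the paper's experiment setup.
HIDDEN = (256, 256)          # two hidden layers for policy and value MLPs
LEARNING_RATE = 3e-4         # Adam learning rate
REPLAY_BUFFER_SIZE = 10**6   # replay buffer capacity
BATCH_SIZE = 256             # sampling mini-batch size

def make_mlp(in_dim, out_dim):
    """Two-hidden-layer MLP as described; ReLU is an assumed activation."""
    return nn.Sequential(
        nn.Linear(in_dim, HIDDEN[0]), nn.ReLU(),
        nn.Linear(HIDDEN[0], HIDDEN[1]), nn.ReLU(),
        nn.Linear(HIDDEN[1], out_dim),
    )

# Hypothetical dimensions for illustration (e.g. a MuJoCo locomotion task).
policy_net = make_mlp(in_dim=17, out_dim=6)
value_net = make_mlp(in_dim=17 + 6, out_dim=1)

policy_opt = torch.optim.Adam(policy_net.parameters(), lr=LEARNING_RATE)
value_opt = torch.optim.Adam(value_net.parameters(), lr=LEARNING_RATE)
```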