Stabilizing Off-Policy Deep Reinforcement Learning from Pixels
Authors: Edoardo Cetin, Philip J Ball, Stephen Roberts, Oya Celiktutan
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the effectiveness of A-LIX in pixel-based reinforcement learning tasks in two popular and distinct domains featuring a diverse set of continuous and discrete control problems. |
| Researcher Affiliation | Academia | (1) Centre for Robotics Research, Department of Engineering, King's College London; (2) Department of Engineering Science, University of Oxford. |
| Pseudocode | No | The paper describes its methods but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We open-source our code to facilitate reproducibility and future extensions. Code: https://github.com/Aladoro/Stabilizing-Off-Policy-RL |
| Open Datasets | Yes | We evaluate the effectiveness of A-LIX for pixel-based RL on continuous control tasks from the DeepMind Control Suite (DMC) (Tassa et al., 2018). ... We make use of the popular Atari Learning Environment (ALE) (Bellemare et al., 2013)... |
| Dataset Splits | Yes | In Table 4, we show the performance in each of the evaluated 15 DMC environments by reporting the mean and standard deviations over the cumulative returns obtained midway and at the end of training for the medium and hard benchmark tasks, respectively. |
| Hardware Specification | No | The paper mentions 'support from Toyota Motor Corporation contributed towards funding the utilized computational resources,' but does not provide specific details such as GPU/CPU models, memory, or clock speeds used for the experiments. |
| Software Dependencies | No | The paper mentions 'Optimizer Adam (Kingma & Ba, 2014)' in Tables 6 and 7, but does not provide specific version numbers for software dependencies such as the Adam implementation or the deep learning framework used (e.g., PyTorch, TensorFlow). |
| Experiment Setup | Yes | In Tables 6 and 7 we provide the full list of hyperparameters used in our implementations for DMC and Atari 100k, respectively. |