Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings
Authors: Lili Chen, Kimin Lee, Aravind Srinivas, Pieter Abbeel
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, we show that SEER does not degrade the performance of RL agents while significantly saving computation and memory across a diverse set of DeepMind Control environments and Atari games. |
| Researcher Affiliation | Collaboration | Lili Chen¹, Kimin Lee¹, Aravind Srinivas², Pieter Abbeel¹; ¹UC Berkeley, ²OpenAI |
| Pseudocode | No | The paper does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | See Appendices ?? and ?? for more hyperparameters and Appendix ?? for source code. |
| Open Datasets | Yes | We first demonstrate the compute-efficiency of SEER on the DeepMind Control Suite (DMControl; Tassa et al. [45]) and Atari games [2] benchmarks. |
| Dataset Splits | No | The paper refers to standard benchmarks (DeepMind Control Suite and Atari games) but does not explicitly state the training, validation, and test splits needed for reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions software like PyTorch in the acknowledgements and refers to various algorithms/architectures, but it does not specify version numbers for any key software components or libraries. |
| Experiment Setup | Yes | For all experiments, we use the hyperparameters and architecture of data-efficient Rainbow [48]. For SEER, we freeze the first fully-connected layer in CURL experiments and the last convolutional layer of the encoder in Rainbow experiments. We present the best results across various values of the encoder freezing time T_f. See Appendices ?? and ?? for more hyperparameters and Appendix ?? for source code. |
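
The Experiment Setup row describes freezing part of the encoder at a chosen freezing time. As a rough illustration of how that freezing time might interact with a replay buffer that stores embeddings, the following is a minimal pure-Python sketch, not the authors' implementation. The names `frozen_layers`, `EmbeddingReplayBuffer`, and `freeze_step` are hypothetical, the "encoder" is a toy pair-averaging function rather than a convolutional network, and converting already-stored observations at freeze time is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence

Vector = List[float]


def frozen_layers(obs: Sequence[float]) -> Vector:
    """Toy stand-in for frozen lower encoder layers (hypothetical).

    Averages adjacent pairs of values, so the output is half the input size.
    """
    return [(obs[i] + obs[i + 1]) / 2.0 for i in range(0, len(obs) - 1, 2)]


@dataclass
class EmbeddingReplayBuffer:
    """Stores raw observations before freeze_step and embeddings afterwards."""

    freeze_step: int                      # hypothetical name for the freezing time
    encoder: Callable[[Sequence[float]], Vector]
    items: List[Vector] = field(default_factory=list)
    frozen: bool = False

    def add(self, step: int, obs: Sequence[float]) -> None:
        # Freeze once the training step reaches the freezing time.
        if not self.frozen and step >= self.freeze_step:
            self._freeze()
        # After freezing, keep only the smaller embedding.
        self.items.append(self.encoder(obs) if self.frozen else list(obs))

    def _freeze(self) -> None:
        # Assumption of this sketch: re-encode everything stored so far,
        # so the buffer holds only embeddings from this point on.
        self.items = [self.encoder(o) for o in self.items]
        self.frozen = True


buffer = EmbeddingReplayBuffer(freeze_step=2, encoder=frozen_layers)
for step in range(4):
    buffer.add(step, [1.0, 3.0, 5.0, 7.0])
print(buffer.frozen, [len(item) for item in buffer.items])
```

In this toy setting, each stored item shrinks from four values to two once the freezing step is reached, which mirrors the memory saving the table attributes to SEER only in spirit; the actual layer choices are those quoted in the Experiment Setup row.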