Active Vision Reinforcement Learning under Limited Visual Observability
Authors: Jinghuan Shang, Michael S. Ryoo
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through a series of experiments, we show the effectiveness of our method across a range of observability conditions and its adaptability to existing RL algorithms. (A toy sketch of limited-observability cropping follows this table.) |
| Researcher Affiliation | Academia | Jinghuan Shang Michael S. Ryoo Department of Computer Science, Stony Brook University {jishang, mryoo}@cs.stonybrook.edu |
| Pseudocode | No | The paper describes the algorithm modifications for DQN and SAC using textual descriptions and mathematical equations, but it does not provide any structured pseudocode blocks or figures explicitly labeled 'Algorithm' or 'Pseudocode'. |
| Open Source Code | Yes | Our project page, code, and library are available at this link |
| Open Datasets | Yes | Our library currently supports active vision agents on Robosuite [114], a robot manipulation environment in 3D, as well as Atari games [9] and the DeepMind Control Suite (DMC) [97], which offer 2D active vision cases. |
| Dataset Splits | No | The paper specifies training duration in terms of transitions ('Each agent is trained with one million transitions for each of the 26 Atari games, or trained with 0.1 million transitions for each of the 6 DMC tasks and 5 Robosuite tasks') and evaluates using 'IQM of raw rewards from 30 evaluations' (see the IQM sketch after this table), but it does not provide explicit dataset splits (e.g., 80/10/10% train/validation/test) for reproduction. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, or memory specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions backbone algorithms like DQN, SAC, and DrQ-v2, and environments such as Robosuite, Atari, and the DeepMind Control Suite. However, it does not provide specific version numbers for software dependencies like Python, PyTorch, or other libraries used for implementation. |
| Experiment Setup | Yes | Details on architectures and hyperparameters can be found in the Appendix. Appendix A.2 'Hyper-parameter Settings' provides tables (Table 6, 7, 8, 9) with specific values for total steps, replay buffer size, learning rates, batch size, update frequencies, and other algorithm-specific parameters for DQN, SAC, and DrQ-v2. |
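As a toy illustration of the "limited visual observability" setting named in the title, the sketch below crops a small movable viewport out of a full frame, so the agent only ever sees a partial observation whose position it controls through sensory actions. This is a minimal sketch of the general idea only, not the paper's implementation; the frame size, viewport size, and the `partial_observation` function name are assumptions.

```python
import numpy as np

def partial_observation(frame, center, fov=20):
    """Crop a square field-of-view around `center` from a full frame.

    frame:  (H, W, C) array, the full environment frame.
    center: (row, col) viewpoint controlled by the agent's sensory action.
    fov:    side length of the visible window (assumed value, not from
            the paper).
    Returns a zero-padded (fov, fov, C) partial observation.
    """
    h, w, c = frame.shape
    obs = np.zeros((fov, fov, c), dtype=frame.dtype)
    top, left = center[0] - fov // 2, center[1] - fov // 2
    # Intersect the requested window with the frame bounds; anything
    # outside the frame stays zero (the agent simply sees nothing there).
    r0, r1 = max(top, 0), min(top + fov, h)
    c0, c1 = max(left, 0), min(left + fov, w)
    if r0 < r1 and c0 < c1:
        obs[r0 - top:r1 - top, c0 - left:c1 - left] = frame[r0:r1, c0:c1]
    return obs

# Usage: an 84x84 RGB frame, with the agent "looking" near the top-left.
frame = np.random.randint(0, 256, size=(84, 84, 3), dtype=np.uint8)
obs = partial_observation(frame, center=(10, 10), fov=20)
print(obs.shape)  # (20, 20, 3)
```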
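The evaluation protocol quoted in the Dataset Splits row reports the interquartile mean (IQM) of raw rewards over 30 evaluations. IQM averages the middle 50% of scores, discarding the lowest and highest quarters, which makes it more robust to outlier runs than a plain mean. Below is a minimal NumPy sketch of that metric; the example returns are synthetic values for illustration only.

```python
import numpy as np

def iqm(scores):
    """Interquartile mean: mean of the central 50% of scores,
    after discarding the lowest and highest quarters."""
    s = np.sort(np.asarray(scores, dtype=float))
    n = len(s)
    lo, hi = n // 4, n - n // 4  # indices bounding the middle half
    return s[lo:hi].mean()

# Usage: aggregate 30 evaluation returns (synthetic values here).
rng = np.random.default_rng(0)
returns = rng.normal(loc=100.0, scale=25.0, size=30)
print(f"IQM over 30 evaluations: {iqm(returns):.2f}")
```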