Unsupervised Video Object Segmentation for Deep Reinforcement Learning
Authors: Vikash Goel, Jameson Weng, Pascal Poupart
NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Sec. 5 evaluates the approach empirically on 59 Atari games. Finally, Sec. 6 concludes the paper and discusses possible future extensions. |
| Researcher Affiliation | Academia | Vik Goel, Jameson Weng, Pascal Poupart Cheriton School of Computer Science, Waterloo AI Institute, University of Waterloo, Canada Vector Institute, Toronto, Canada {v5goel,jj2weng,ppoupart}@uwaterloo.ca |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/vik-goel/MOREL. |
| Open Datasets | Yes | We showcase the performance of MOREL on all 59 Atari games where we observe a notable improvement in comparison to A2C and PPO for 26 and 25 games respectively, and a worse performance for 3 and 9 games respectively. |
| Dataset Splits | No | The paper mentions training on Atari games and conducting an ablation study but does not explicitly provide details about training, validation, or test dataset splits. |
| Hardware Specification | No | The paper mentions the use of the 'CrySP RIPPLE Facility at the University of Waterloo' but does not provide specific hardware details such as GPU/CPU models or memory amounts used for experiments. |
| Software Dependencies | No | The paper mentions optimizers and algorithms but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We collect 100k frames by following a random policy. Using an Adam optimizer [18] with learning rate 1 × 10−4 and batch size 16, we minimize Ltotal for 250k steps. Following the experimental setup from [24], we train each agent for 10 million timesteps with one timestep for each frame. |
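
The Experiment Setup row above quotes the pre-training configuration: 100k frames collected with a random policy, Adam with learning rate 1e-4, batch size 16, and 250k optimization steps minimizing L_total. The sketch below illustrates how such a loop could be wired up; it is not the authors' implementation (their code, at https://github.com/vik-goel/MOREL, defines the actual network and loss). The model, dataset, and loss here are placeholders, and only the hyperparameters come from the quoted text.

```python
# Minimal sketch of the pre-training loop described in the Experiment Setup row.
# Only NUM_FRAMES, BATCH_SIZE, LEARNING_RATE, and NUM_STEPS come from the paper;
# the network, dataset, and loss are placeholders, not the MOREL architecture.
import torch
from torch.utils.data import DataLoader, TensorDataset

NUM_FRAMES = 100_000      # frames gathered by following a random policy
BATCH_SIZE = 16
LEARNING_RATE = 1e-4
NUM_STEPS = 250_000       # optimization steps minimizing L_total

# Placeholder dataset of pairs of consecutive 84x84 Atari frames.
frames = torch.zeros(NUM_FRAMES, 2, 84, 84)
loader = DataLoader(TensorDataset(frames), batch_size=BATCH_SIZE, shuffle=True)

# Placeholder network; the real segmentation model is described in the paper.
model = torch.nn.Sequential(
    torch.nn.Conv2d(2, 32, kernel_size=8, stride=4),
    torch.nn.ReLU(),
    torch.nn.Flatten(),
    torch.nn.Linear(32 * 20 * 20, 1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)

step = 0
while step < NUM_STEPS:
    for (batch,) in loader:
        loss = model(batch).pow(2).mean()  # stand-in for the paper's L_total
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        step += 1
        if step >= NUM_STEPS:
            break
```

After this pre-training phase, the quoted setup trains each RL agent for 10 million environment timesteps (one timestep per frame), following the experimental protocol the paper cites; that stage is not shown in the sketch.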