Can Increasing Input Dimensionality Improve Deep Reinforcement Learning?

Authors: Kei Ota, Tomoaki Oiki, Devesh Jha, Toshisada Mariyama, Daniel Nikovski

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through numerical experiments, we show that the proposed method outperforms several other state-of-the-art algorithms in terms of both sample efficiency and performance. In this section, we try to answer the following questions with our experiments to describe the performance of OFENet.
Researcher Affiliation | Industry | 1 Mitsubishi Electric Corporation, Kanagawa, Japan; 2 Mitsubishi Electric Research Laboratories, Cambridge, USA.
Pseudocode | Yes | Algorithm 1 outlines this procedure. The pseudo-code for the proposed method is presented in Algorithm 1.
Open Source Code | Yes | Codes for the proposed method are available at http://www.merl.com/research/license/OFENet.
Open Datasets | Yes | All these experiments are done in the MuJoCo simulation environment. Figure 4 shows the learning curves on OpenAI Gym tasks.
Dataset Splits | Yes | To measure the auxiliary score, we collect 100K transitions as a training set and 20K transitions as a test set, using a random policy.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU or CPU models.
Software Dependencies | No | The paper mentions software like MuJoCo, OpenAI Gym, and the Adam optimizer, but does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | The SAC agent is trained with the hyper-parameters described in (Haarnoja et al., 2018), where the networks have two hidden layers of 256 units each. All the networks are trained with mini-batches of size 256 and the Adam optimizer, with a learning rate of 3 × 10⁻⁴. (A minimal configuration sketch follows the table.)
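
The sketch below illustrates only the quoted configuration values (two hidden layers of 256 units, mini-batches of 256, Adam with learning rate 3 × 10⁻⁴); it is not the authors' released code. The observation/action dimensions (OBS_DIM, ACT_DIM), the mlp helper, and the single critic-style network are placeholder assumptions added for illustration.

```python
# Minimal sketch of the reported training configuration, assuming PyTorch.
# Only the layer sizes, batch size, optimizer, and learning rate come from
# the paper's quoted text; everything else is a placeholder.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM = 17, 6      # placeholder dimensions for a MuJoCo-style task
HIDDEN_UNITS = 256            # "two hidden layers of 256 units"
BATCH_SIZE = 256              # "mini-batches of size 256"
LEARNING_RATE = 3e-4          # "learning rate of 3 x 10^-4"

def mlp(in_dim: int, out_dim: int) -> nn.Sequential:
    """Two-hidden-layer MLP matching the quoted network sizes."""
    return nn.Sequential(
        nn.Linear(in_dim, HIDDEN_UNITS), nn.ReLU(),
        nn.Linear(HIDDEN_UNITS, HIDDEN_UNITS), nn.ReLU(),
        nn.Linear(HIDDEN_UNITS, out_dim),
    )

# A critic-style network mapping (state, action) to a scalar value.
q_network = mlp(OBS_DIM + ACT_DIM, 1)
optimizer = torch.optim.Adam(q_network.parameters(), lr=LEARNING_RATE)

# One illustrative gradient step on random data, showing the batch size in use.
states = torch.randn(BATCH_SIZE, OBS_DIM)
actions = torch.randn(BATCH_SIZE, ACT_DIM)
targets = torch.randn(BATCH_SIZE, 1)
loss = nn.functional.mse_loss(q_network(torch.cat([states, actions], dim=-1)), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The paper's full setup additionally trains the SAC actor and critics together with OFENet's feature extractor; those components are omitted from this sketch.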