Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Can Increasing Input Dimensionality Improve Deep Reinforcement Learning?
Authors: Kei Ota, Tomoaki Oiki, Devesh Jha, Toshisada Mariyama, Daniel Nikovski
ICML 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through numerical experiments, we show that the proposed method outperforms several other state-of-the-art algorithms in terms of both sample efficiency and performance. In this section, we try to answer the following questions with our experiments to describe the performance of OFENet. |
| Researcher Affiliation | Industry | 1Mitsubishi Electric Corporation, Kanagawa, Japan 2Mitsubishi Electric Research Laboratory, Cambridge, USA. |
| Pseudocode | Yes | Algorithm 1 outlines this procedure. The pseudo-code for the proposed method is presented in Algorithm 1. |
| Open Source Code | Yes | Codes for the proposed method are available at http://www.merl.com/research/license/OFENet. |
| Open Datasets | Yes | All these experiments are done in MuJoCo simulation environment. Figure 4 shows the learning curves on OpenAI Gym tasks. |
| Dataset Splits | Yes | To measure the auxiliary score, we collect 100K transitions as a training set and 20K transitions as a test set, using a random policy. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions software like MuJoCo, OpenAI Gym, and the Adam optimizer, but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | The SAC agent is trained with the hyper-parameters described in (Haarnoja et al., 2018), where the networks have two hidden layers which have 256 units. All the networks are trained with mini-batches of size 256 and Adam optimizer, with a learning rate 3 × 10⁻⁴. |
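
The setup values quoted in the table can be collected into a small configuration object for anyone attempting a reproduction. The following is only a sketch: the class names, field names, and the `hidden_sizes` helper are hypothetical and do not come from the paper or its released code. Only the numeric values (two hidden layers of 256 units, batch size 256, Adam, the learning rate, and the 100K/20K transition split collected with a random policy) are taken from the rows above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SACSetup:
    """Hypothetical container for the SAC settings quoted in the table."""
    hidden_layers: int = 2        # "two hidden layers"
    hidden_units: int = 256       # "which have 256 units"
    batch_size: int = 256         # "mini-batches of size 256"
    optimizer: str = "adam"       # "Adam optimizer"
    learning_rate: float = 3e-4   # learning rate quoted in the table


@dataclass(frozen=True)
class AuxiliaryScoreData:
    """Hypothetical container for the auxiliary-score data split."""
    train_transitions: int = 100_000  # "100K transitions as a training set"
    test_transitions: int = 20_000    # "20K transitions as a test set"
    policy: str = "random"            # collected "using a random policy"


def hidden_sizes(setup: SACSetup) -> list[int]:
    """Expand the layer count and width into a per-layer size list."""
    return [setup.hidden_units] * setup.hidden_layers


if __name__ == "__main__":
    setup = SACSetup()
    data = AuxiliaryScoreData()
    print(hidden_sizes(setup))
    print(data.train_transitions, data.test_transitions)
```

Since the paper does not report software versions or hardware, a configuration like this covers only the settings the table marks as available; library versions would still need to be chosen and recorded separately.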