Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Dimension-Wise Importance Sampling Weight Clipping for Sample-Efficient Reinforcement Learning

Authors: Seungyul Han, Youngchul Sung

ICML 2019 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Numerical results show that the proposed algorithm outperforms PPO and other RL algorithms on various OpenAI Gym tasks. |
| Researcher Affiliation | Academia | School of Electrical Engineering, KAIST, Daejeon, South Korea. Correspondence to: Youngchul Sung <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 (DISC) |
| Open Source Code | Yes | The source code for DISC is available at http://github.com/seungyulhan/disc/. |
| Open Datasets | Yes | We evaluate our algorithm on various OpenAI Gym tasks (Brockman et al., 2016). |
| Dataset Splits | No | The paper discusses training on environments and reusing old sample batches, but does not explicitly provide training/validation/test dataset splits. |
| Hardware Specification | No | The paper does not provide any specific details regarding the hardware used for the experiments (e.g., GPU/CPU models, memory). |
| Software Dependencies | No | The paper mentions OpenAI Gym tasks and baselines but does not list specific software dependencies with version numbers (e.g., programming languages, libraries, frameworks). |
| Experiment Setup | Yes | A detailed description of the hyper-parameters of PPO, PPO-AMBER, and DISC is provided in Table A.1 in the Appendix. |
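To illustrate the idea named in the paper's title, the sketch below contrasts PPO's scalar importance-sampling (IS) ratio clipping with a dimension-wise variant. This is a minimal illustration based only on the title, assuming a factorized (e.g., diagonal Gaussian) policy whose IS weight is a product of per-dimension ratios; the function names are hypothetical and this is not a verified reimplementation of the DISC algorithm.

```python
import numpy as np

def ppo_clip_objective(ratio, adv, eps=0.2):
    # Standard PPO surrogate: clip the single scalar IS ratio
    # r = pi_new(a|s) / pi_old(a|s), then take the pessimistic minimum.
    return np.minimum(ratio * adv, np.clip(ratio, 1 - eps, 1 + eps) * adv)

def dimensionwise_clip_ratio(logp_new, logp_old, eps=0.2):
    # Hypothetical dimension-wise variant: for a factorized policy the
    # IS weight factorizes as r = prod_i exp(logp_new_i - logp_old_i).
    # Clip each per-dimension ratio before taking the product, instead
    # of clipping the product once as PPO does.
    per_dim = np.exp(logp_new - logp_old)          # shape (..., action_dim)
    clipped = np.clip(per_dim, 1 - eps, 1 + eps)   # clip each dimension
    return clipped.prod(axis=-1)                   # recombine into one weight
```

In high-dimensional action spaces, a single scalar clip can let one dimension's extreme ratio dominate the product; clipping per dimension bounds each factor individually, which is the intuition the title suggests.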