Dimension-Wise Importance Sampling Weight Clipping for Sample-Efficient Reinforcement Learning

Authors: Seungyul Han, Youngchul Sung

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Numerical results show that the proposed algorithm outperforms PPO and other RL algorithms on various OpenAI Gym tasks. |
| Researcher Affiliation | Academia | "School of Electrical Engineering, KAIST, Daejeon, South Korea. Correspondence to: Youngchul Sung <ycsung@kaist.ac.kr>." |
| Pseudocode | Yes | "Algorithm 1 DISC" (a sketch of the clipping idea follows this table). |
| Open Source Code | Yes | "The source code for DISC is available at http://github.com/seungyulhan/disc/." |
| Open Datasets | Yes | "We evaluate our algorithm on various Open AI GYM tasks (Brockman et al., 2016)" (a usage example follows this table). |
| Dataset Splits | No | The paper discusses training on environments and reusing old sample batches, but it does not provide explicit training/validation/test splits. |
| Hardware Specification | No | The paper does not specify the hardware used to run the experiments (e.g., GPU/CPU models, memory). |
| Software Dependencies | No | The paper mentions OpenAI Gym tasks and baselines but does not list software dependencies with version numbers (e.g., programming languages, libraries, frameworks). |
| Experiment Setup | Yes | "Detailed description of the hyper-parameters of PPO, PPO-AMBER and DISC is provided in Table A.1 in Appendix." |
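
For context on the Pseudocode row: Algorithm 1 in the paper performs dimension-wise importance sampling (IS) weight clipping for factorized policies. The following is a minimal sketch of that idea, not the authors' implementation; the function name, the shared clipping range `eps`, and the placement of the pessimistic `min` are assumptions here, and the exact objective (including DISC's sample-reuse machinery) is given in Algorithm 1 and the released code.

```python
import torch

def disc_surrogate(logp_new, logp_old, adv, eps=0.2):
    """Dimension-wise clipped surrogate (illustrative sketch only).

    logp_new, logp_old: per-dimension log-probabilities of the taken
        actions, shape (batch, action_dim); valid for factorized
        policies such as a diagonal Gaussian.
    adv: advantage estimates, shape (batch,).
    """
    # Per-dimension IS ratios r_i = pi(a_i|s) / pi_old(a_i|s); the full
    # ratio is their product, which can explode in high dimensions.
    ratios = torch.exp(logp_new - logp_old)            # (batch, action_dim)

    # DISC's core idea: clip each dimension's ratio separately before
    # taking the product, rather than clipping the single product ratio
    # as PPO does.
    clipped = torch.clamp(ratios, 1.0 - eps, 1.0 + eps).prod(dim=-1)
    full = ratios.prod(dim=-1)

    # Pessimistic bound in the style of PPO's clipped objective;
    # maximize this quantity (or minimize its negation as a loss).
    return torch.min(full * adv, clipped * adv).mean()
```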
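
For the Open Datasets row: the evaluation tasks come from OpenAI Gym (Brockman et al., 2016). A minimal evaluation loop under the classic Gym API is sketched below; the environment name `HalfCheetah-v2` is an assumption (one of the MuJoCo tasks typically used in this line of work, requiring a MuJoCo installation), and the random action stands in for a trained DISC policy.

```python
import gym

# Classic Gym API (pre-0.26): reset() returns obs, step() returns
# (obs, reward, done, info).
env = gym.make("HalfCheetah-v2")   # hypothetical task choice; needs MuJoCo
obs = env.reset()
total_reward, done = 0.0, False
while not done:
    action = env.action_space.sample()  # stand-in for a trained DISC policy
    obs, reward, done, info = env.step(action)
    total_reward += reward
print(f"episode return: {total_reward:.1f}")
```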