IPO: Interior-Point Policy Optimization under Constraints

Authors: Yongshuai Liu, Jiaxin Ding, Xin Liu (pp. 4940-4947)

AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive evaluations to compare our approach with state-of-the-art baselines. Our algorithm outperforms the baseline algorithms, in terms of reward maximization and constraint satisfaction.
Researcher Affiliation | Academia | Yongshuai Liu, Jiaxin Ding, Xin Liu, University of California, Davis, {yshliu, jxding, xinliu}@ucdavis.edu
Pseudocode | Yes | Algorithm 1: The procedure of IPO (a sketch of the corresponding objective appears after this table).
Open Source Code | No | No explicit statement or link providing concrete access to the source code for the described methodology was found.
Open Datasets | Yes | We conduct experiments and compare IPO with CPO and PDO in various scenarios: three tasks in the Mujoco simulator (Point-Gather, Point-Circle (Achiam et al. 2017), Half Cheetah-Safe (Chow et al. 2019)) and a grid-world task (Mars-Rover) inspired by (Chow et al. 2015).
Dataset Splits | No | The paper describes sampling N trajectories and running experiments multiple times with different random seeds, but it does not provide dataset splits (e.g., percentages or counts for training, validation, and test sets) in the conventional supervised-learning sense (see the multi-seed aggregation sketch after this table).
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments were provided in the paper.
Software Dependencies | No | The paper mentions software components such as PPO, TRPO, the Adam optimizer, and the Mujoco simulator, but does not provide version numbers for any of them.
Experiment Setup | No | The paper discusses hyperparameters such as the PPO clip rate r, the logarithmic-barrier hyperparameter t, and learning rates, but it reports only their tuning procedure or ranges, not the concrete values used in the main experiments (an illustrative, hypothetical configuration follows this table).
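On the Pseudocode row: Algorithm 1 of the paper optimizes PPO's clipped surrogate augmented with a logarithmic barrier on the expected constraint cost. The function below is a minimal sketch of that objective under an assumed PyTorch setting; the function name, its arguments, and the default clip_rate and t values are illustrative assumptions, not details taken from the paper.

    import torch

    def ipo_surrogate(ratio, adv, cost_estimate, cost_limit, clip_rate=0.2, t=50.0):
        """Sketch of an IPO-style objective (to be maximized): PPO's clipped
        surrogate plus a log barrier that keeps the estimated constraint cost
        below cost_limit. clip_rate and t are placeholder values."""
        # Standard PPO clipped surrogate on the reward advantage.
        clipped_ratio = torch.clamp(ratio, 1.0 - clip_rate, 1.0 + clip_rate)
        reward_term = torch.min(ratio * adv, clipped_ratio * adv).mean()
        # Interior-point log barrier: finite only while the constraint is
        # strictly satisfied (cost_estimate < cost_limit); a larger t makes
        # the barrier a tighter approximation of the hard constraint.
        barrier_term = torch.log(cost_limit - cost_estimate) / t
        return reward_term + barrier_term

Ascending this objective lets the barrier's gradient grow as the policy approaches the constraint boundary, which is the mechanism that distinguishes IPO from dual-variable methods such as PDO.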
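On the Dataset Splits row: since evaluation proceeds by on-policy rollouts rather than fixed train/validation/test splits, the protocol amounts to repeating training under different random seeds and aggregating the outcomes. A minimal sketch of that aggregation, where run_experiment is a hypothetical callable standing in for one complete training run:

    import statistics

    def evaluate_over_seeds(run_experiment, seeds=(0, 1, 2, 3, 4)):
        # run_experiment(seed=...) is assumed to return a final
        # (reward, constraint_cost) pair for one training run.
        rewards, costs = zip(*(run_experiment(seed=s) for s in seeds))
        return (statistics.mean(rewards), statistics.stdev(rewards),
                statistics.mean(costs), statistics.stdev(costs))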
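On the Experiment Setup row: a reproduction would need to fix the unreported hyperparameters explicitly. The record below only illustrates the kind of configuration that would close that gap; every value is a placeholder assumption and none of them comes from the paper.

    # All values are hypothetical placeholders; the paper does not report them.
    ipo_config = {
        "clip_rate": 0.2,                  # PPO clip rate r (assumed)
        "barrier_t": 50.0,                 # logarithmic-barrier hyperparameter t (assumed)
        "learning_rate": 3e-4,             # Adam step size (assumed)
        "trajectories_per_iteration": 50,  # N sampled trajectories (assumed)
        "random_seeds": [0, 1, 2, 3, 4],   # independent repeated runs (assumed)
    }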