Model-Free Trajectory Optimization for Reinforcement Learning

Authors: Riad Akrour, Gerhard Neumann, Hany Abdulsamad, Abbas Abdolmaleki

ICML 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experimental section demonstrates that, on tasks with highly non-linear dynamics, MOTO outperforms similar methods that rely on a linearization of these dynamics. Additionally, it is shown on a simulated Robot Table Tennis Task that MOTO is able to scale to high-dimensional tasks while keeping the sample complexity relatively low, making it amenable to direct application to a physical system.
Researcher Affiliation | Academia | Riad Akrour (1) AKROUR@IAS.TU-DARMSTADT.DE; Abbas Abdolmaleki (3) ABBAS.A@UA.PT; Hany Abdulsamad (2) ABDULSAMAD@IAS.TU-DARMSTADT.DE; Gerhard Neumann (1) NEUMANN@IAS.TU-DARMSTADT.DE. 1: CLAS and 2: IAS, TU Darmstadt, Darmstadt, Germany; 3: IEETA, University of Aveiro, Aveiro, Portugal.
Pseudocode | Yes | Algorithm 1: Model-Free Trajectory Optimization (MOTO).
Open Source Code | No | The paper provides no statements about, or links to, open-source code for the described methodology.
Open Datasets | No | The paper uses simulated environments such as the "multi-link swing-up tasks" and the "simulated Robot Table Tennis Task". It does not provide access information (links, DOIs, or formal citations) for these environments or for any external datasets.
Dataset Splits | No | The paper describes sampling "M rollouts" per iteration and discusses sample reuse, but it does not specify explicit train/validation/test splits (percentages, sample counts, or predefined external splits) for reproducibility.
Hardware Specification | No | The paper does not report hardware details such as GPU or CPU models, memory, or cloud computing instance types used for the experiments.
Software Dependencies | No | The paper does not specify software dependencies with version numbers (e.g., programming languages, libraries, or frameworks).
Experiment Setup | Yes | "Input: Initial policy π0, number of trajectories per iteration M, step-size ϵ and entropy reduction rate β0..." The number of rollouts per iteration is reduced to M = 20. (A minimal loop sketch based on these inputs follows the table.)
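
The sketch below illustrates the kind of iteration loop implied by the inputs quoted above (initial policy π0, M rollouts per iteration, step-size ϵ, entropy reduction rate β0). It is not the paper's Algorithm 1: the toy 1-D task, the exponentiated-return reweighting with per-time-step weighted regression, and all helper names and constants are illustrative assumptions standing in for MOTO's fitted quadratic Q-function and closed-form KL- and entropy-constrained policy update.

```python
# Minimal sketch of a MOTO-like outer loop (assumptions noted above).
import numpy as np

HORIZON = 20          # time steps per episode (illustrative)
M = 20                # rollouts per iteration (the paper reduces M to 20)
EPSILON = 0.1         # "step-size"; used here as a simple interpolation rate
BETA_0 = 0.05         # "entropy reduction rate"; used here to anneal noise
N_ITERS = 50

def rollout(K, k, sigma, rng):
    """One episode on a toy 1-D linear task with a time-dependent
    linear-Gaussian policy u_t ~ N(K_t * x_t + k_t, sigma_t^2)."""
    x = rng.normal(1.0, 0.1)
    states, actions, ret = [], [], 0.0
    for t in range(HORIZON):
        u = K[t] * x + k[t] + sigma[t] * rng.normal()
        states.append(x)
        actions.append(u)
        ret += -(x ** 2 + 0.1 * u ** 2)   # quadratic cost as negative reward
        x = x + 0.1 * u                   # simple linear dynamics
    return np.array(states), np.array(actions), ret

def main():
    rng = np.random.default_rng(0)
    # Initial time-dependent linear-Gaussian policy pi_0.
    K = np.zeros(HORIZON)
    k = np.zeros(HORIZON)
    sigma = np.ones(HORIZON)

    for it in range(N_ITERS):
        # 1. Collect M rollouts with the current policy.
        batch = [rollout(K, k, sigma, rng) for _ in range(M)]
        returns = np.array([r for (_, _, r) in batch])

        # 2. Exponentiated-return weights (a REPS-like reweighting standing in
        #    for MOTO's quadratic Q-model and constrained update).
        w = np.exp((returns - returns.max()) / (returns.std() + 1e-8))
        w /= w.sum()

        # 3. Per-time-step weighted linear regression of actions on states.
        for t in range(HORIZON):
            xs = np.array([s[t] for (s, _, _) in batch])
            us = np.array([a[t] for (_, a, _) in batch])
            X = np.stack([xs, np.ones_like(xs)], axis=1)
            W = np.diag(w)
            theta = np.linalg.solve(X.T @ W @ X + 1e-6 * np.eye(2),
                                    X.T @ W @ us)
            # Interpolate toward the new gains with rate EPSILON.
            K[t] = (1 - EPSILON) * K[t] + EPSILON * theta[0]
            k[t] = (1 - EPSILON) * k[t] + EPSILON * theta[1]
            # Anneal exploration noise, mimicking an entropy reduction rate.
            sigma[t] = max(0.05, sigma[t] * np.exp(-BETA_0))

        print(f"iter {it:02d}  mean return {returns.mean():8.3f}")

if __name__ == "__main__":
    main()
```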