reproducibilityindex.ai

Hierarchical Policy Search via Return-Weighted Density Estimation

Authors: Takayuki Osa, Masashi Sugiyama

AAAI 2018

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental Results To visualize the performance, we first evaluate HPSDE in toy problems and the puddle world task where the return functions are multi-modal. Subsequently, we show the experiments with the motion planning task for a robotic manipulator, which is a practical application of hierarchical RL.
Researcher Affiliation	Academia	Takayuki Osa: University of Tokyo, 277-0882, Chiba, Japan; RIKEN Center for AIP, 103-0027, Tokyo, Japan. Masashi Sugiyama: RIKEN Center for AIP, 103-0027, Tokyo, Japan; University of Tokyo, 277-0882, Chiba, Japan.
Pseudocode	Yes	Algorithm 1 Hierarchical Policy Search via Return Weighted Density Estimation (HPSDE)
Open Source Code	No	The paper does not provide any statements about code release, nor does it include links to a source code repository.
Open Datasets	No	The paper discusses task setups such as the 'puddle world task' and 'motion planning for a redundant manipulator' in a 'simulation environment, developed based on VREP', but it does not mention using or providing access to any publicly available or open datasets with proper citations or links.
Dataset Splits	No	The paper does not provide specific details regarding training, validation, or test dataset splits, such as percentages, absolute sample counts, or citations to predefined splits.
Hardware Specification	No	The paper mentions running simulations and modeling a 'KUKA Light Weight Robot', but it does not provide any specific details about the hardware (e.g., CPU, GPU models, cloud instance types) used to conduct these experiments.
Software Dependencies	No	The paper mentions various software components and methods such as 'VREP', 'DMPs', 'Gaussian Process (GP)', 'REPS', 'RWR', and 'HiREPS', but it does not provide specific version numbers for any of these software dependencies.
Experiment Setup	Yes	For this task, we used a linear feature function φ(s) = [s, 1] and set O_max = 10 for HPSDE. [...] We set O_max = 20 for HPSDE.
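
The linear feature function quoted in the Experiment Setup row, φ(s) = [s, 1], can be illustrated with a minimal Python sketch. This is not the authors' code, since the paper releases none. The function names, the example state, and the example weights are hypothetical, and `linear_output` only shows how such features are typically combined with a weight vector.

```python
def linear_features(state):
    """Return phi(s) = [s, 1]: the state entries followed by a constant bias term."""
    return [float(x) for x in state] + [1.0]


def linear_output(weights, state):
    """Evaluate w . phi(s), a linear function of the state with a bias (illustrative only)."""
    phi = linear_features(state)
    if len(weights) != len(phi):
        raise ValueError("weights must have one entry per feature")
    return sum(w * f for w, f in zip(weights, phi))


# Hypothetical 2-D state and weights, chosen only for illustration.
example_state = [0.5, -1.0]
example_weights = [2.0, 1.0, 0.5]

print(linear_features(example_state))
print(linear_output(example_weights, example_state))
```

The appended constant 1 lets a single weight vector carry both the slope and the offset of the linear map, which is why the feature vector has one more entry than the state.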