PD-MORL: Preference-Driven Multi-Objective Reinforcement Learning Algorithm

Authors: Toygun Basaklar, Suat Gumussoy, Umit Ogras

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "This section extensively evaluates the proposed PD-MORL technique using commonly used MORL benchmarks with discrete state-action spaces (Section 5.1) and complex MORL environments with continuous state-action spaces (Section 5.2)."
Researcher Affiliation | Collaboration | Toygun Basaklar (UW-Madison, Madison, WI 53706, basaklar@wisc.edu); Suat Gumussoy (Siemens Corporate Technology, Princeton, NJ 08540, suat.gumussoy@siemens.com); Umit Y. Ogras (UW-Madison, Madison, WI 53706, uogras@wisc.edu)
Pseudocode | Yes | "Algorithm 1: Preference Driven MO-DDQN-HER" (a hedged sketch of this style of update follows the table)
Open Source Code | Yes | "The source code is attached with the rest of the supplementary material, providing a complete description of the multi-objective RL environments and instructions on reproducing our experiments."
Open Datasets | Yes | "We first evaluate PD-MORL's performance on two commonly used discrete MORL benchmarks: Deep Sea Treasure (Hayes et al., 2022) and Fruit Tree Navigation (Yang et al., 2019)."
Dataset Splits | No | The paper describes the benchmarks used (e.g., Deep Sea Treasure, Fruit Tree Navigation, MO-Walker2d-v2) but does not state how these were split into training, validation, or test sets (e.g., specific percentages or sample counts) for its experiments.
Hardware Specification | Yes | "We run all our experiments on a local server including Intel Xeon Gold 6242R."
Software Dependencies | No | The paper mentions using radial basis function interpolation and cites "SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python" (2020), but does not explicitly list specific software dependencies with version numbers (see the interpolation example after the table).
Experiment Setup | Yes | "Table 4: Hyperparameters for MO-DDQN-HER"
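The Pseudocode row above points to Algorithm 1, "Preference Driven MO-DDQN-HER". For orientation, below is a minimal sketch of the generic building block such an algorithm rests on: a double-DQN update for a Q-network conditioned on both the observation and a preference vector, with the multi-objective Q-values scalarized by that preference. This is an assumption-laden illustration, not the authors' implementation; it omits the preference-driven alignment term and the hindsight preference relabeling implied by "HER", and names such as `QNet` and `ddqn_update` are hypothetical.

```python
# Illustrative sketch only (PyTorch): a preference-conditioned multi-objective
# double-DQN update. NOT the authors' Algorithm 1; the preference-driven
# alignment term and HER-style preference relabeling are omitted.
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Q-network conditioned on the observation and a preference vector w.
    Returns a multi-objective Q-value vector for every discrete action."""
    def __init__(self, obs_dim, n_actions, n_objectives, hidden=256):
        super().__init__()
        self.n_actions, self.n_objectives = n_actions, n_objectives
        self.net = nn.Sequential(
            nn.Linear(obs_dim + n_objectives, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions * n_objectives),
        )

    def forward(self, obs, w):
        q = self.net(torch.cat([obs, w], dim=-1))
        return q.view(-1, self.n_actions, self.n_objectives)   # (B, A, O)

def ddqn_update(q_net, target_net, optimizer, batch, gamma=0.99):
    """One scalarized double-DQN step. `batch` holds float tensors
    (obs, w, r_vec, next_obs, done) and a long tensor act."""
    obs, w, act, r_vec, next_obs, done = batch
    batch_idx = torch.arange(obs.shape[0])
    with torch.no_grad():
        # Online network picks the greedy action under the scalarized value w·Q.
        next_scalar = (q_net(next_obs, w) * w.unsqueeze(1)).sum(-1)   # (B, A)
        a_star = next_scalar.argmax(dim=1)                            # (B,)
        # Target network evaluates that action (the double-DQN decoupling).
        q_next = target_net(next_obs, w)[batch_idx, a_star]           # (B, O)
        target = r_vec + gamma * (1.0 - done).unsqueeze(-1) * q_next
    q_pred = q_net(obs, w)[batch_idx, act]                            # (B, O)
    loss = nn.functional.mse_loss(q_pred, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In a full training loop the replay buffer would presumably also store the preference in use when each transition was collected, so that transitions can be relabeled with alternative preferences before this update is applied.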
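The Software Dependencies row notes that radial basis function interpolation is used and that SciPy is cited, but no versions are pinned. As a point of reference, the snippet below shows the standard SciPy entry point for that operation, `scipy.interpolate.RBFInterpolator` (available since SciPy 1.7). The data and variable names (`key_weights`, `key_values`) are synthetic placeholders, not the quantities the authors actually interpolate.

```python
# Hedged example of radial basis function interpolation with SciPy.
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(0)
# Two-objective preferences parameterized by their first weight w1 (w2 = 1 - w1).
key_weights = np.linspace(0.0, 1.0, 11)[:, None]   # (11, 1) interpolation nodes
key_values = rng.normal(size=(11, 2))               # synthetic quantities at each node

rbf = RBFInterpolator(key_weights, key_values)      # default thin-plate-spline kernel
queries = np.array([[0.25], [0.6]])
print(rbf(queries))                                 # interpolated values at unseen preferences
```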