Learning Pareto Set for Multi-Objective Continuous Robot Control

Authors: Tianye Shu, Ke Shang, Cheng Gong, Yang Nan, Hisao Ishibuchi

IJCAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compare our method with two state-of-the-art MORL algorithms on seven multi-objective continuous robot control problems. Experimental results show that our method achieves the best overall performance with the fewest training parameters.
Researcher Affiliation | Academia | (1) Department of Computer Science and Engineering, Southern University of Science and Technology; (2) National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University; (3) Department of Computer Science, City University of Hong Kong
Pseudocode | Yes | Algorithm 1: Learning Pareto Set via Policy Gradient (see the hedged sketch after this table)
Open Source Code | Yes | All code of Hyper-MORL is available from https://github.com/HisaoLabSUSTC/Hyper-MORL.
Open Datasets | Yes | To test the performance of Hyper-MORL, we use seven problems in Table 1 from a multi-objective robot control benchmark suite [Xu et al., 2020] (a loading example follows the table).
Dataset Splits | No | The paper does not provide explicit training/validation/test splits (percentages, sample counts, or citations to predefined splits) for reproducibility.
Hardware Specification | No | The paper reports training time but no hardware details such as CPU/GPU models, memory, or cloud instance specifications.
Software Dependencies | No | The paper mentions using PPO and Adam but gives no version numbers for these or other key software components or libraries.
Experiment Setup | Yes | The optimizer is Adam with learning rate η = 5 × 10⁻⁵. In Hyper-MORL, the parameters α and d are set to 0.15 (i.e., 15%) and 10, respectively. The number of sampled preferences K is set to 6 for two-objective problems and 15 for three-objective problems (collected as a config sketch below).
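The Pseudocode row references "Algorithm 1: Learning Pareto Set via Policy Gradient." Below is a minimal, hedged sketch of that idea as described here: a hypernetwork maps a sampled preference vector to policy parameters and is trained by a policy-gradient-style update on the preference-scalarized return. The network sizes, the linear policy, and the fake rollout are illustrative assumptions, not the authors' exact implementation (which uses PPO).

```python
# Hedged sketch of Pareto-set learning via policy gradient with a hypernetwork.
# Dimensions, the linear policy, and the placeholder rollout are assumptions.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, N_OBJ = 8, 2, 2
POLICY_PARAMS = OBS_DIM * ACT_DIM + ACT_DIM  # weights + biases of a linear policy

class HyperNet(nn.Module):
    """Maps a preference vector omega to the parameters of a linear policy."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_OBJ, 64), nn.ReLU(),
            nn.Linear(64, POLICY_PARAMS),
        )

    def forward(self, omega):
        return self.net(omega)

def policy_action(params, obs):
    """Build the linear policy from generated parameters and act on obs."""
    w = params[: OBS_DIM * ACT_DIM].view(ACT_DIM, OBS_DIM)
    b = params[OBS_DIM * ACT_DIM :]
    return torch.tanh(obs @ w.T + b)

hyper = HyperNet()
opt = torch.optim.Adam(hyper.parameters(), lr=5e-5)  # lr from the reported setup

for step in range(100):
    # Sample K preferences from the simplex (K = 6 for two objectives, per the report).
    omegas = torch.distributions.Dirichlet(torch.ones(N_OBJ)).sample((6,))
    loss = 0.0
    for omega in omegas:
        params = hyper(omega)
        # Placeholder rollout: a real implementation would run PPO in the
        # environment and estimate the vector return; here we fabricate one
        # differentiable return so the sketch runs end to end.
        obs = torch.randn(OBS_DIM)
        act = policy_action(params, obs)
        vector_return = torch.stack([-(act**2).sum(), -((act - 1) ** 2).sum()])
        # Scalarize the multi-objective return by the sampled preference.
        loss = loss - (omega * vector_return).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Training one hypernetwork over many sampled preferences is what lets a single model represent the whole Pareto set: at deployment, any preference vector can be fed in to generate a policy without retraining.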
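The Open Datasets row cites the multi-objective robot control benchmark of Xu et al. [2020]. The paper uses that suite directly; as an assumption for illustration, the MO-Gymnasium package reimplements comparable multi-objective MuJoCo tasks, and loading one would look roughly like this (the package, the "mo-halfcheetah-v4" id, and the objective names are assumptions, not the authors' setup):

```python
# Hedged sketch: loading a multi-objective MuJoCo task comparable to the
# benchmark of Xu et al. (2020), via the mo_gymnasium package (an assumption).
import mo_gymnasium as mo_gym

env = mo_gym.make("mo-halfcheetah-v4")
obs, info = env.reset(seed=0)
obs, vector_reward, terminated, truncated, info = env.step(env.action_space.sample())
print(vector_reward)  # one reward per objective, e.g. forward speed and energy cost
```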
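For reference, the hyperparameters reported in the Experiment Setup row, collected into a single config sketch. The dict keys are illustrative; the authors' code may organize these values differently.

```python
# The reported Hyper-MORL hyperparameters as a config dict (key names assumed).
config = dict(
    optimizer="Adam",
    learning_rate=5e-5,    # η = 5 × 10⁻⁵
    alpha=0.15,            # α, i.e., 15%
    d=10,                  # parameter d from the paper
    K_two_objective=6,     # sampled preferences for two-objective problems
    K_three_objective=15,  # sampled preferences for three-objective problems
)
```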