Offline Constrained Multi-Objective Reinforcement Learning via Pessimistic Dual Value Iteration

Authors: Runzhe Wu, Yufeng Zhang, Zhuoran Yang, Zhaoran Wang

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We also conduct numerical experiments to verify our theory. Please see Appendix I for details. The results show that the proposed algorithm is not only provably efficient but also applicable.
Researcher Affiliation | Academia | Runzhe Wu, Shanghai Jiao Tong University, runzhe@sjtu.edu.cn; Yufeng Zhang, Northwestern University, yufengzhang2023@u.northwestern.edu; Zhuoran Yang, Princeton University, zy6@princeton.edu; Zhaoran Wang, Northwestern University, zhaoranwang@gmail.com
Pseudocode | Yes | Algorithm 1 (Pessimistic planning) and Algorithm 2 (Pessimistic Dual Iteration, PEDI).
Open Source Code | No | The paper does not provide an explicit statement or link to the open-source code for the described methodology.
Open Datasets | No | The paper refers to 'a dataset $D = \{(s_h^\tau, a_h^\tau, c_h^\tau)\}_{h,\tau=1}^{H,N}$ with N trajectories collected a priori by an experimentor' but does not provide specific access information like a link, DOI, or formal citation for this dataset.
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, or testing.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup | No | The main text of the paper does not contain specific experimental setup details (concrete hyperparameter values, training configurations, or system-level settings). It refers to 'Appendix I for details' for numerical experiments, implying these details are not in the main body.