Offline RL with Discrete Proxy Representations for Generalizability in POMDPs
Authors: Pengjie Gu, Xinyu Cai, Dong Xing, Xinrun Wang, Mengchen Zhao, Bo An
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments to evaluate ORDER, showcasing its effectiveness in offline RL for diverse partially observable scenarios and highlighting the significance of discrete proxy representations in generalization performance. We conduct all experiments with five distinct random seeds, each consisting of ten separate runs. |
| Researcher Affiliation | Collaboration | Pengjie Gu 1, Xinyu Cai 1, Dong Xing 2, Xinrun Wang 1, Mengchen Zhao 3, Bo An 1. 1 School of Computer Science and Engineering, Nanyang Technological University, Singapore; 2 College of Computer Science and Technology, Zhejiang University; 3 Noah's Ark Lab, Huawei |
| Pseudocode | No | The paper describes the training stages and architectural components but does not provide a formal pseudocode block or algorithm figure. |
| Open Source Code | No | The paper does not include any explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We evaluate ORDER and other baseline algorithms on gym locomotion tasks and maze navigation tasks in the D4RL benchmark [12] under different partial observation situations. |
| Dataset Splits | No | The paper mentions using the D4RL benchmark and conducting experiments, but it does not specify the exact training, validation, and test dataset splits (e.g., percentages or sample counts) used for reproducibility. |
| Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions using specific algorithms such as IQL [28] and VQ-VAE [45], but it does not provide version numbers for these or for any other software dependencies (e.g., Python, PyTorch, TensorFlow) that would be necessary for reproducibility. |
| Experiment Setup | No | The paper states: |