Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics
Authors: Xinyu Zhang, Wenjie Qiu, Yi-Chen Li, Lei Yuan, Chengxing Jia, Zongzhang Zhang, Yang Yu
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental evaluation across six benchmark MuJoCo tasks with variable parameters demonstrates that DORA not only achieves a more precise dynamics encoding but also significantly outperforms existing baselines in terms of performance. Section 5 (Experiments) then states: In this section, we conduct the experiments to answer the following questions. |
| Researcher Affiliation | Collaboration | National Key Laboratory for Novel Software Technology, Nanjing University, China; School of Artificial Intelligence, Nanjing University, China; Polixir Technologies. |
| Pseudocode | Yes | To sum up, the pseudocodes of training and testing are illustrated in Algorithm 1 and Appendix B, respectively. |
| Open Source Code | Yes | We release the code at GitHub: https://github.com/Xinyuz26/DORA |
| Open Datasets | Yes | We choose MuJoCo tasks for experiments, including HalfCheetah-v3, Walker2d-v3, Hopper-v3, and InvertedDoublePendulum-v2, which are common benchmarks in offline RL (Todorov et al., 2012). |
| Dataset Splits | No | Each environment contains 10 tasks for training and 10 tasks for testing under each of the IID, OOD, and non-stationary dynamics settings. There is no explicit mention of a separate validation split or dataset. |
| Hardware Specification | No | The paper does not provide specific details on the hardware used, such as GPU/CPU models or memory specifications. |
| Software Dependencies | No | The paper mentions using a GRU network and a linear layer for parameterization in Section C.5, but it does not name specific software dependencies with version numbers (e.g., PyTorch, TensorFlow, or the Python version), which are crucial for reproducibility. A hedged sketch of such an encoder is given after the table. |
| Experiment Setup | Yes | The paper includes Table 4, 'Configurations and hyper-parameters used in offline encoder training', which details per-environment values such as 'Debias loss weight', 'Distortion loss weight', 'History length', 'Latent space dim', 'Batch size', 'Learning rate', 'Training steps', and 'Radius of radial basis function'. A placeholder configuration skeleton using these names follows the encoder sketch below. |
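
To make the architecture referenced above more concrete, the following is a minimal PyTorch sketch of a history encoder built from a GRU followed by a linear layer, matching the description in Section C.5. The transition layout, dimensions, and default values are assumptions for illustration, not the authors' released implementation.

```python
# Minimal sketch of a GRU + linear history encoder (Section C.5 of the paper).
# The (s, a, r, s') transition layout and all dimensions below are assumptions.
import torch
import torch.nn as nn

class ContextEncoder(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int, latent_dim: int = 8, hidden_dim: int = 128):
        super().__init__()
        # Each history step is a flattened (state, action, reward, next state) tuple.
        step_dim = 2 * obs_dim + act_dim + 1
        self.gru = nn.GRU(input_size=step_dim, hidden_size=hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, latent_dim)

    def forward(self, history: torch.Tensor) -> torch.Tensor:
        # history: (batch, history_length, step_dim) -> latent: (batch, latent_dim)
        _, h_n = self.gru(history)
        return self.head(h_n.squeeze(0))

# Example usage with hypothetical HalfCheetah-like dimensions.
encoder = ContextEncoder(obs_dim=17, act_dim=6, latent_dim=8)
dummy_history = torch.randn(32, 20, 2 * 17 + 6 + 1)  # 32 histories of length 20
z = encoder(dummy_history)
print(z.shape)  # torch.Size([32, 8])
```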
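
Likewise, the snippet below sketches a configuration skeleton using the hyperparameter names reported in Table 4. All values are placeholders chosen for illustration; the paper reports different settings per environment.

```python
# Hypothetical configuration skeleton mirroring the hyperparameter names in
# Table 4. The values are placeholders, not the paper's reported settings.
encoder_training_config = {
    "env_name": "HalfCheetah-v3",
    "debias_loss_weight": 1.0,       # weight of the debiasing term
    "distortion_loss_weight": 1.0,   # weight of the distortion term
    "history_length": 20,            # number of transitions fed to the GRU
    "latent_space_dim": 8,           # dimensionality of the dynamics embedding
    "batch_size": 256,
    "learning_rate": 3e-4,
    "training_steps": 100_000,
    "rbf_radius": 1.0,               # radius of the radial basis function
}
```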