Efficient Dynamics Modeling in Interactive Environments with Koopman Theory
Authors: Arnab Kumar Mondal, Siba Smarak Panigrahi, Sai Rajeswar, Kaleem Siddiqi, Siamak Ravanbakhsh
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results in offline-RL datasets demonstrate the effectiveness of our approach for reward and state prediction over a long horizon. |
| Researcher Affiliation | Collaboration | Arnab Kumar Mondal (Mila, McGill University); Siba Smarak Panigrahi (Mila, McGill University); Sai Rajeswar (ServiceNow Research); Kaleem Siddiqi (Mila, McGill University); Siamak Ravanbakhsh (Mila, McGill University) |
| Pseudocode | Yes | Algorithm 1 Diagonal Koopman Dynamics model (a hedged sketch follows the table) |
| Open Source Code | Yes | Our code can be found at https://github.com/arnab39/koopman-dynamica. |
| Open Datasets | Yes | For the forward dynamics modeling experiments, we use the D4RL Fu et al. (2020) dataset, which is a popular offline-RL environment. |
| Dataset Splits | No | The paper states 'We divide the dataset of 1M samples into 80:20 splits for training and testing, respectively.', but does not mention a separate validation split or explain how validation is handled if one is used implicitly (e.g., for hyperparameter tuning). |
| Hardware Specification | Yes | Each iteration consists of one gradient update of the entire model using a mini-batch of 256 on an A100 GPU. |
| Software Dependencies | No | The paper provides code snippets in JAX and Flax (Appendix J) but does not specify version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | To train the dynamics model, we randomly sample trajectories of length τ from the training data, where τ is the horizon specified during training. We test our learned dynamics model for a horizon length of 100 by randomly sampling 50,000 trajectories of length 100 from the test set. (A sketch of this split-and-sampling protocol follows the table.) |
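Since the paper's Algorithm 1 is only named in the table, here is a minimal JAX/Flax sketch of what one step of a diagonal Koopman dynamics model looks like. The module name `DiagonalKoopmanDynamics`, the parameter names `mu`/`omega`/`B`, and the choice of `latent_dim` are illustrative assumptions, not the authors' implementation; the state encoder and decoder that wrap this latent step are omitted.

```python
import jax
import jax.numpy as jnp
import flax.linen as nn

class DiagonalKoopmanDynamics(nn.Module):
    """One latent step z_{t+1} = Lambda * z_t + B a_t with a diagonal,
    complex-valued Koopman operator Lambda (sketch, not the authors' code)."""
    latent_dim: int  # size of the Koopman latent space (illustrative)

    @nn.compact
    def __call__(self, z, action):
        # Per-coordinate eigenvalues lambda_i = exp(mu_i + i * omega_i):
        # mu_i sets decay/growth, omega_i the rotation frequency.
        mu = self.param('mu', nn.initializers.normal(0.01), (self.latent_dim,))
        omega = self.param('omega', nn.initializers.normal(1.0), (self.latent_dim,))
        lam = jnp.exp(mu + 1j * omega)
        # The action enters the latent dynamics through a learned linear map B.
        Ba = nn.Dense(self.latent_dim, name='B')(action)
        return lam * z + Ba

# Usage: roll the latent state forward one step.
model = DiagonalKoopmanDynamics(latent_dim=32)
z0 = jnp.zeros((32,), dtype=jnp.complex64)
a0 = jnp.ones((6,))
params = model.init(jax.random.PRNGKey(0), z0, a0)
z1 = model.apply(params, z0, a0)
```

Because the operator is diagonal, an n-step rollout reduces to elementwise powers of `lam`, which is what makes long-horizon prediction cheap relative to a dense transition matrix.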
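The dataset-split and trajectory-sampling protocol quoted in the table can likewise be sketched. The function names, the flat `transitions` array, and the batch size of 256 (taken from the hardware row) are assumptions rather than the authors' pipeline; in particular, this sketch ignores episode boundaries, which a real D4RL data loader would need to respect.

```python
import jax
import jax.numpy as jnp

def split_dataset(transitions, train_frac=0.8):
    # 80:20 train/test split of the 1M-sample buffer; the paper reports
    # no separate validation split.
    n_train = int(transitions.shape[0] * train_frac)
    return transitions[:n_train], transitions[n_train:]

def sample_windows(key, data, tau, batch_size=256):
    # Draw random contiguous windows of length tau: tau = training horizon
    # during training, and tau = 100 (50,000 windows) at test time.
    starts = jax.random.randint(key, (batch_size,), 0, data.shape[0] - tau)
    take = lambda s: jax.lax.dynamic_slice_in_dim(data, s, tau)
    return jax.vmap(take)(starts)
```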