Lyapunov Density Models: Constraining Distribution Shift in Learning-Based Control

Authors: Katie Kang, Paula Gradu, Jason J. Choi, Michael Janner, Claire Tomlin, Sergey Levine

ICML 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We present a method of using an LDM in model-based RL as a constraint on the model optimizer, and evaluate this method empirically." (Introduction) "In this section, we present an experimental evaluation of our method." (Section 9) |
| Researcher Affiliation | Academia | University of California, Berkeley. Correspondence to: Katie Kang <katiekang@eecs.berkeley.edu>. |
| Pseudocode | Yes | "Algorithm 1 LDM Training" is presented as a practical algorithm (Section 6). A hedged sketch of such a training step appears after this table. |
| Open Source Code | No | The project webpage can be found at https://sites.google.com/berkeley.edu/ldm/. This link points to a project overview page, not a specific code repository for the methodology. |
| Open Datasets | Yes | "We conduct our evaluation on two RL benchmark environments, hopper and lunar lander (Brockman et al., 2016), and a medical application, simglucose (Xie, 2018)." An environment-setup sketch appears after this table. |
| Dataset Splits | No | The paper reports the total size of the datasets used for training, but it does not specify explicit train/validation/test splits (percentages or counts) for reproducing the data partitioning. Evaluation is conducted directly in the RL environments. |
| Hardware Specification | No | The paper does not state the hardware used for the experiments, such as CPU or GPU models or memory specifications; it refers only generally to training deep neural networks. |
| Software Dependencies | No | The paper names software components such as the "neural spline flow" and the "Adam" optimizer, but it does not provide version numbers for these or other key dependencies (e.g., Python, PyTorch/TensorFlow) required for replication. A hedged flow-model sketch appears after this table. |
| Experiment Setup | Yes | Appendix D details the experimental setup, including hyperparameters for flow-model training (e.g., "learning rate", "coupling transform MLP"), LDM training (e.g., "batch size", "architecture", "learning rate", "CQL(H)"), dynamics-model training, and MPC parameters ("sampling prior", "num random actions", "horizon"), presented in tables with specific values. A hedged MPC sketch appears after this table. |
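The paper's evaluation environments are the Gym hopper and lunar lander benchmarks (Brockman et al., 2016) and the simglucose simulator (Xie, 2018). The following is a minimal sketch of how these could be instantiated; the exact environment IDs and the simglucose patient name are assumptions, since the paper does not list them.

```python
# Hedged sketch: instantiating the three evaluation environments.
# The environment IDs and the simglucose patient name are assumptions,
# not values taken from the paper.
import gym
from gym.envs.registration import register

hopper = gym.make("Hopper-v2")                 # MuJoCo hopper benchmark
lander = gym.make("LunarLanderContinuous-v2")  # Box2D lunar lander

# simglucose (https://github.com/jxx123/simglucose) exposes its simulator
# through gym's registry; this registration follows the package README.
register(
    id="simglucose-adolescent2-v0",
    entry_point="simglucose.envs:T1DSimEnv",
    kwargs={"patient_name": "adolescent#002"},
)
glucose = gym.make("simglucose-adolescent2-v0")

for env in (hopper, lander, glucose):
    obs = env.reset()
    obs, reward, done, info = env.step(env.action_space.sample())
```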
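The paper identifies the density model only as a neural spline flow trained with Adam, without naming a library or versions. The sketch below builds such a model with the `nflows` library's rational-quadratic spline coupling transform; the library choice, layer sizes, learning rate, and input dimensionality are all assumptions standing in for the actual values in Appendix D.

```python
# Hedged sketch of a neural spline flow density model trained with Adam.
# Only "neural spline flow" and "Adam" come from the paper; the nflows
# library, architecture sizes, and learning rate are assumptions.
import torch
from nflows.flows.base import Flow
from nflows.distributions.normal import StandardNormal
from nflows.transforms.base import CompositeTransform
from nflows.transforms.coupling import PiecewiseRationalQuadraticCouplingTransform
from nflows.transforms.permutations import ReversePermutation
from nflows.nn.nets import ResidualNet

def make_flow(dim, num_layers=4, hidden=64):
    transforms = []
    for _ in range(num_layers):
        transforms.append(ReversePermutation(features=dim))
        transforms.append(PiecewiseRationalQuadraticCouplingTransform(
            mask=torch.arange(dim) % 2,  # alternate-coordinate coupling mask
            transform_net_create_fn=lambda in_f, out_f: ResidualNet(
                in_f, out_f, hidden_features=hidden),
            tails="linear", tail_bound=5.0,
        ))
    return Flow(CompositeTransform(transforms), StandardNormal([dim]))

# e.g., 11-d state + 3-d action for hopper, modeling p(s, a) jointly
flow = make_flow(dim=14)
opt = torch.optim.Adam(flow.parameters(), lr=1e-4)

def train_step(batch):  # batch: (N, dim) tensor of concatenated (s, a) pairs
    loss = -flow.log_prob(batch).mean()  # maximize data log-likelihood
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```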
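Algorithm 1 trains the LDM with a Q-learning-style backup: under the paper's definition, the LDM satisfies a maximum-over-time Bellman equation, $G(s,a) = \max\big(E(s,a),\, \min_{a'} G(s', a')\big)$ with $E(s,a) = -\log p(s,a)$. The sketch below shows one gradient step of such a backup on an offline batch. The network interface, the sampled approximation of the action minimization, and the omission of the paper's CQL(H) regularizer are simplifications and assumptions, not the paper's exact procedure.

```python
# Hedged sketch of one LDM training step (maximum-over-time Bellman backup).
# The (s, a) -> scalar network interface, uniform action sampling, and the
# omission of the CQL(H) regularizer are assumptions/simplifications.
import torch
import torch.nn.functional as F

def ldm_update(ldm, ldm_target, flow, opt, batch, action_low, action_high,
               num_action_samples=10):
    s, a, s_next = batch  # offline transitions (s, a, s')
    with torch.no_grad():
        # E(s, a) = -log p(s, a): negative log density under the flow model.
        E = -flow.log_prob(torch.cat([s, a], dim=-1))
        # Approximate min_{a'} G(s', a') by sampling candidate next actions
        # uniformly (an assumption; the paper's minimizer may differ).
        N, K = s_next.shape[0], num_action_samples
        a_cand = action_low + (action_high - action_low) * \
            torch.rand(N, K, action_low.numel())
        s_rep = s_next.unsqueeze(1).expand(-1, K, -1)
        g_next = ldm_target(s_rep, a_cand).squeeze(-1).min(dim=1).values
        # Backup target: G(s, a) = max(E(s, a), min_{a'} G(s', a')).
        target = torch.maximum(E, g_next)
    loss = F.mse_loss(ldm(s, a).squeeze(-1), target)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```

Using a slowly updated target network `ldm_target` mirrors standard Q-learning practice; whether the paper uses one is not established by this table's evidence.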
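The method uses the LDM as a constraint on the planner in model-based RL, and Appendix D lists MPC parameters such as the sampling prior, number of random actions, and horizon. Below is a hedged random-shooting MPC sketch in which candidate action sequences are filtered by their worst-case LDM value along the model rollout; the shooting scheme, the `ldm_threshold` parameter, and all default values are assumptions, not the paper's exact planner.

```python
# Hedged sketch: random-shooting MPC constrained by the LDM. Lower LDM
# values correspond to staying in higher-density, in-distribution regions.
# The shooting scheme, `ldm_threshold`, and parameter values are assumptions.
import torch

def ldm_constrained_mpc(s0, dynamics, reward_fn, ldm, action_space,
                        num_candidates=512, horizon=20, ldm_threshold=10.0):
    # Sample candidate action sequences from a uniform prior over the action space.
    low = torch.as_tensor(action_space.low)
    high = torch.as_tensor(action_space.high)
    actions = low + (high - low) * torch.rand(num_candidates, horizon, low.numel())

    s = s0.unsqueeze(0).expand(num_candidates, -1)
    returns = torch.zeros(num_candidates)
    worst_ldm = torch.full((num_candidates,), -float("inf"))

    for t in range(horizon):
        a = actions[:, t]
        # Track the worst (largest) LDM value encountered along each rollout.
        worst_ldm = torch.maximum(worst_ldm, ldm(s, a).squeeze(-1))
        returns += reward_fn(s, a)
        s = dynamics(s, a)  # learned dynamics-model rollout

    # Discard sequences whose rollouts violate the LDM constraint.
    returns = returns.masked_fill(worst_ldm > ldm_threshold, -float("inf"))
    best = returns.argmax()
    return actions[best, 0]  # execute first action, then replan (MPC)
```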