DySLIM: Dynamics Stable Learning by Invariant Measure for Chaotic Systems
Authors: Yair Schiff, Zhong Yi Wan, Jeffrey B. Parker, Stephan Hoyer, Volodymyr Kuleshov, Fei Sha, Leonardo Zepeda-Núñez
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We use our framework to propose a tractable and sample efficient objective that can be used with any existing learning objectives. Our Dynamics Stable Learning by Invariant Measure (DySLIM) objective enables model training that achieves better point-wise tracking and long-term statistical accuracy relative to other learning objectives. By targeting the distribution with a scalable regularization term, we hope that this approach can be extended to more complex systems exhibiting slowly-variant distributions, such as weather and climate models. Code to reproduce our experiments is available here. (A hedged sketch of such a regularized objective follows the table.) |
| Researcher Affiliation | Collaboration | (1) Department of Computer Sciences, Cornell Tech, New York, NY, USA; (2) Google Research, Mountain View, CA, USA; (3) Department of Mathematics, University of Wisconsin-Madison, WI, USA. Correspondence to: Yair Schiff <yairschiff@cs.cornell.edu>, Leonardo Zepeda-Núñez <lzepedanunez@google.com>. |
| Pseudocode | No | The paper describes mathematical formulations and procedural steps for its method but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Code to reproduce our experiments is available here. |
| Open Datasets | Yes | The Lorenz 63 model (Lorenz, 1963) is defined on a 3-dimensional state space by the following non-linear ordinary differential equation... Training and evaluation data were generated using a 4th order Runge-Kutta numerical integrator... We generate 5,000 training trajectories each of length 100,000 steps... The KS system is known to be chaotic (Papageorgiou & Smyrlis, 1991)... We generate data for this system using a spectral solver (Dresdner et al., 2022)... The training dataset consists of 800 trajectories of 1,200 steps... We also consider the Navier-Stokes equation with Kolmogorov forcing... We repeated the process 128 times to obtain the training data, and 32 times for both the validation and test data. |
| Dataset Splits | Yes | Starting from initial conditions sampled from µ, we generate 5,000 training trajectories each of length 100,000 steps and 20,000 test trajectories of length 1,000,000 steps. At training and evaluation time these trajectories are down-sampled along the temporal dimension by a factor of 400, so that the effective time step was Δt = 0.4... The training dataset consists of 800 trajectories of 1,200 steps... Our evaluation set consists of 100 trajectories of length 1,000 steps... For each trajectory, we let the solver warm up for 50 units of time... We repeated the process 128 times to obtain the training data, and 32 times for both the validation and test data. (A minimal sketch of this trajectory generation and down-sampling appears after the table.) |
| Hardware Specification | Yes | Table 9 (Computational resources by experiment): Lorenz 63: 1 V100 GPU, 16 GB; Kuramoto-Sivashinsky: 1 V100/A100 GPU, 16/40 GB; Kolmogorov Flow: 1 A100 GPU, 40 GB. |
| Software Dependencies | No | Table 8 lists software used (e.g., Flax, Jax, Jax-CFD, NumPy) along with their licenses and a citation to the project's paper or repository, but it does not specify exact version numbers for these software libraries, which are necessary for reproducible dependency information (e.g., 'Flax 0.6.1' instead of just 'Flax (Heek et al., 2023)'). |
| Experiment Setup | Yes | Table 4 (Model, learning rate, and number of training steps for each experiment in Section 5): Lorenz 63: MLP w/ residual connection to input, LR 1e-4, 500k training steps; Kuramoto-Sivashinsky: dilated convolutional network (Stachenfeld et al., 2022), LR 5e-4, 300k steps; Kolmogorov Flow: dilated convolutional network (Stachenfeld et al., 2022), LR 5e-4, 720k steps. |
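
To make the "Research Type" row more concrete, here is a minimal, hedged sketch of what a DySLIM-style objective could look like: a standard one-step regression loss on the learned stepper, plus a distribution-matching regularizer that pulls model-generated states toward samples from the invariant measure. The RBF-kernel MMD term, the bandwidth, the weight `lam`, and the names `rbf_mmd`, `dyslim_style_loss`, and `apply_fn` are illustrative assumptions, not the paper's exact formulation; the released code is the authoritative reference.

```python
# Hedged sketch of a regularized objective: one-step loss + measure-matching term.
import jax.numpy as jnp

def rbf_mmd(x, y, bandwidth=1.0):
    """Biased (V-statistic) squared MMD between sample sets x, y with an RBF kernel."""
    def k(a, b):
        d2 = jnp.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
        return jnp.exp(-d2 / (2.0 * bandwidth ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()

def dyslim_style_loss(params, apply_fn, u_t, u_tp1, mu_samples, lam=0.1):
    """One-step regression loss plus a measure-matching regularizer.

    u_t, u_tp1: batches of consecutive states from the training trajectories.
    mu_samples: a batch of states drawn from (an approximation of) the
        invariant measure, e.g. states sampled from long trajectories.
    """
    pred = apply_fn(params, u_t)
    step_loss = jnp.mean((pred - u_tp1) ** 2)       # point-wise tracking term
    pushforward = apply_fn(params, mu_samples)      # stepper applied to measure samples
    reg = rbf_mmd(pushforward, mu_samples)          # penalize drift away from the measure
    return step_loss + lam * reg
```

The appeal, per the quoted abstract, is that the regularizer only needs batches of states sampled from the data distribution, so it adds a scalable term on top of whatever base learning objective is used.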
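
For the "Open Datasets" and "Dataset Splits" rows, the sketch below shows how Lorenz 63 trajectories could be generated with a 4th-order Runge-Kutta integrator and then temporally down-sampled by a factor of 400, as described above. The inner step size `dt = 1e-3` (giving an effective step of 0.4), the classical parameter values σ = 10, ρ = 28, β = 8/3, and the use of `jax.lax.scan` are assumptions for illustration; the paper's released code may generate the data differently.

```python
# Hedged sketch: Lorenz 63 trajectory generation with RK4 and temporal down-sampling.
import jax
import jax.numpy as jnp

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0  # classical chaotic-regime parameters

def lorenz63(u):
    x, y, z = u
    return jnp.array([SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z])

def rk4_step(u, dt):
    k1 = lorenz63(u)
    k2 = lorenz63(u + 0.5 * dt * k1)
    k3 = lorenz63(u + 0.5 * dt * k2)
    k4 = lorenz63(u + dt * k3)
    return u + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def trajectory(u0, n_steps, dt=1e-3, downsample=400):
    # Integrate n_steps fine steps, then keep every `downsample`-th state so
    # the effective time step is dt * downsample (0.4 with these values).
    def step(u, _):
        u_next = rk4_step(u, dt)
        return u_next, u_next
    _, states = jax.lax.scan(step, u0, xs=None, length=n_steps)
    return states[::downsample]

# Example: one short coarse trajectory from a fixed initial condition.
traj = trajectory(jnp.array([1.0, 1.0, 1.0]), n_steps=40_000)
print(traj.shape)  # (100, 3): 100 down-sampled states of the 3-dimensional system
```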