Learning Control-Oriented Dynamical Structure from Data
Authors: Spencer M. Richards, Jean-Jacques Slotine, Navid Azizan, Marco Pavone
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On a variety of simulated nonlinear dynamical systems, we empirically demonstrate the efficacy of learned versions of this controller in stable trajectory tracking. Alongside our learning method, we evaluate recent ideas in jointly learning a controller and stabilizability certificate for known dynamical systems; we show experimentally that such methods can be frail in comparison. |
| Researcher Affiliation | Academia | Autonomous Systems Laboratory (ASL), Stanford University, Stanford, CA 94305, USA; Nonlinear Systems Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Laboratory for Information & Decision Systems (LIDS), Massachusetts Institute of Technology, Cambridge, MA 02139, USA. |
| Pseudocode | No | The paper describes the proposed methods in detail but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We provide code to reproduce all of our results at: https://github.com/StanfordASL/Learning-Control-Oriented-Structure. |
| Open Datasets | No | For each system, we begin by uniformly sampling points $\{(x^{(i)}, u^{(i)})\}_{i=1}^N$ from a bounded state-control set $\mathcal{X} \times \mathcal{U} \subset \mathbb{R}^n \times \mathbb{R}^m$, and evaluating the true dynamics to form the labelled data $\mathcal{D}$. This indicates the data was generated for the paper rather than taken from a publicly accessible dataset with a specific link or citation; a sketch of this sampling step is given below the table. |
| Dataset Splits | Yes | Training is performed for 50000 epochs while the loss on a held-out validation set of size $0.10N$ is monitored, where $N$ is the size of the labelled training data set; for each method, the model parameters corresponding to the lowest validation loss are chosen for testing. A sketch of this split is given below the table. |
| Hardware Specification | No | The paper states that neural networks are used and mentions Python and JAX, but it does not specify any particular hardware components such as GPU or CPU models, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using Python, JAX, CasADi, and Ipopt solver, but it does not provide specific version numbers for these software components, which are necessary for reproducible dependency descriptions. |
| Experiment Setup | Yes | Each function in $(f, B, M, K, A_0, \{A_j\}_{j=1}^m)$ is approximated as a feedforward neural network with two hidden layers and 128 hidden tanh activation units per layer. We use the Adam optimizer (Kingma & Ba, 2015) with a learning rate of $10^{-3}$ and otherwise default hyperparameters. Training is performed for 50000 epochs while the loss on a held-out validation set of size $0.10N$ is monitored. For the CCM-based learning method, since Equation (18) is homogeneous in $M(x)$, we choose $\lambda = 0.1$ without loss of generality. Additionally, we fix the overshoot $\alpha = 10$ and the decay rate $\beta = 0.5$ in the auxiliary loss Equation (27). For both the CCM and SDC learning methods, we use $N^{\text{CCM}}_{\text{aux}} = N^{\text{SDC}}_{\text{aux}} = 10000$ unlabelled samples. A sketch of this training setup is given below the table. |
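
The data-generation step quoted in the Open Datasets row can be illustrated with a short JAX sketch. This is a hedged, minimal reconstruction: the state/control dimensions, the sampling bounds, and the placeholder `true_dynamics` function are assumptions for illustration, not the paper's simulated systems or values.

```python
import jax
import jax.numpy as jnp

def true_dynamics(x, u):
    # Placeholder for the true simulated dynamics the paper evaluates to
    # label each sampled (x, u) pair; not one of the paper's systems.
    return -x + jnp.concatenate([u, jnp.zeros(1)])

def sample_labelled_data(key, N, x_low, x_high, u_low, u_high):
    # Uniformly sample states and controls from the bounded set X x U,
    # then evaluate the dynamics to form the labelled data set D.
    key_x, key_u = jax.random.split(key)
    x = jax.random.uniform(key_x, (N, x_low.shape[0]), minval=x_low, maxval=x_high)
    u = jax.random.uniform(key_u, (N, u_low.shape[0]), minval=u_low, maxval=u_high)
    xdot = jax.vmap(true_dynamics)(x, u)
    return {"x": x, "u": u, "xdot": xdot}

# Example with assumed 2-D state and 1-D control bounds (hypothetical values).
key = jax.random.PRNGKey(0)
data = sample_labelled_data(
    key, N=1000,
    x_low=jnp.array([-1.0, -1.0]), x_high=jnp.array([1.0, 1.0]),
    u_low=jnp.array([-1.0]), u_high=jnp.array([1.0]),
)
```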
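
Continuing the sketch above, the 90/10 split and best-validation model selection described in the Dataset Splits row could look roughly like the following; the helper name and dictionary layout are assumptions, not the authors' code.

```python
import jax

def split_data(key, data, val_frac=0.10):
    # Hold out a validation set of size 0.10N, as stated in the paper.
    N = data["x"].shape[0]
    perm = jax.random.permutation(key, N)
    n_val = int(val_frac * N)
    val_idx, train_idx = perm[:n_val], perm[n_val:]
    take = lambda idx: {k: v[idx] for k, v in data.items()}
    return take(train_idx), take(val_idx)

train_data, val_data = split_data(jax.random.PRNGKey(1), data)
```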
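
Finally, the architecture and optimization details in the Experiment Setup row (two hidden layers of 128 tanh units, Adam with learning rate $10^{-3}$, 50000 epochs, lowest-validation-loss checkpointing) might be realized as sketched below, continuing the example. The plain regression loss, the input/output sizes, and the use of optax are assumptions; the paper's actual training objectives are the CCM/SDC losses, not least squares.

```python
import jax
import jax.numpy as jnp
import optax  # assumed optimizer library; the paper only states that Adam is used

def init_mlp(key, sizes=(3, 128, 128, 2)):
    # Two hidden layers with 128 units each; input/output sizes are placeholders.
    keys = jax.random.split(key, len(sizes) - 1)
    return [
        (jax.random.normal(k, (m, n)) / jnp.sqrt(m), jnp.zeros(n))
        for k, m, n in zip(keys, sizes[:-1], sizes[1:])
    ]

def mlp(params, z):
    for W, b in params[:-1]:
        z = jnp.tanh(z @ W + b)  # tanh hidden activations
    W, b = params[-1]
    return z @ W + b

def loss_fn(params, batch):
    # Placeholder regression loss on the labelled dynamics data.
    inputs = jnp.concatenate([batch["x"], batch["u"]], axis=-1)
    return jnp.mean((mlp(params, inputs) - batch["xdot"]) ** 2)

def train(params, train_data, val_data, epochs=50_000):
    opt = optax.adam(1e-3)  # Adam, learning rate 1e-3, otherwise default hyperparameters
    opt_state = opt.init(params)
    best_params, best_val = params, jnp.inf

    @jax.jit
    def step(params, opt_state):
        grads = jax.grad(loss_fn)(params, train_data)
        updates, opt_state = opt.update(grads, opt_state)
        return optax.apply_updates(params, updates), opt_state

    for _ in range(epochs):
        params, opt_state = step(params, opt_state)
        val = loss_fn(params, val_data)          # monitor held-out validation loss
        if val < best_val:
            best_params, best_val = params, val  # keep lowest-validation-loss parameters
    return best_params

# Small epoch count here only to keep the usage example quick.
params = train(init_mlp(jax.random.PRNGKey(2)), train_data, val_data, epochs=100)
```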