FLEX: an Adaptive Exploration Algorithm for Nonlinear Systems
Authors: Matthieu Blanke, Marc Lelarge
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We run several experiments to validate our method. Our code and a demonstration video are available at https://github.com/MB-29/exploration. More details about the experiments can be found in Appendix B. |
| Researcher Affiliation | Academia | Matthieu Blanke¹, Marc Lelarge¹; ¹INRIA, DI ENS, PSL Research University. Correspondence to: Matthieu Blanke <matthieu.blanke@inria.fr>. |
| Pseudocode | Yes | Algorithm 1: Active exploration; Algorithm 2: Flexible Exploration (FLEX) |
| Open Source Code | Yes | Our code and a demonstration video are available at https://github.com/MB-29/exploration. |
| Open Datasets | Yes | We experiment on the pendulum and the cartpole of the DeepMind Control Suite (Tunyasuvunakool et al., 2020). |
| Dataset Splits | No | The paper discusses training models and evaluating them but does not provide specific train/validation/test split percentages, sample counts, or detailed splitting methodology. |
| Hardware Specification | No | The paper mentions measuring computational time 'on a laptop' but does not provide specific details about the hardware components (e.g., CPU, GPU models, memory). |
| Software Dependencies | No | The paper mentions using 'Pytorch (Paszke et al., 2017)' and the 'Adam optimizer (Kingma & Ba, 2015)', as well as the 'mpc package and the iLQR algorithm for exploitation (Amos et al., 2018)', but it does not specify version numbers for these software components. |
| Experiment Setup | Yes | The learning models include various degrees of prior knowledge on the dynamics. A linear model is used for the pendulum, and neural networks are used for the other environments. The Jacobians of Proposition 2 are computed using automatic differentiation. At each time step, the model is evaluated with (2.3) computed over a fixed grid. Neural networks... of width 8 with one hidden layer and tanh nonlinearity, trained using the Adam optimizer... with a batch size of 100. Specific learning rates and cost functions are provided for different environments (see the sketch below the table). |
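
To make the reported setup concrete, the sketch below shows a dynamics model matching the quoted description: a one-hidden-layer network of width 8 with tanh nonlinearity, trained with the Adam optimizer on batches of 100 transitions using PyTorch (which the paper cites). The state/action dimensions, learning rate, and loss function are illustrative assumptions, not values confirmed by the paper, and this is not the authors' implementation.

```python
# Hedged sketch of the reported model setup; dimensions, learning rate, and
# loss are assumptions made for illustration only.
import torch
import torch.nn as nn

state_dim, action_dim = 4, 1  # assumed dimensions (cartpole-like system)

# One hidden layer of width 8 with tanh nonlinearity, as quoted above.
model = nn.Sequential(
    nn.Linear(state_dim + action_dim, 8),
    nn.Tanh(),
    nn.Linear(8, state_dim),  # predicts the next state (or a state increment)
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # assumed learning rate
loss_fn = nn.MSELoss()  # assumed loss

def training_step(states, actions, next_states):
    """One Adam step on a batch of (state, action, next_state) transitions."""
    optimizer.zero_grad()
    predictions = model(torch.cat([states, actions], dim=-1))
    loss = loss_fn(predictions, next_states)
    loss.backward()
    optimizer.step()
    return loss.item()

# Example call with a batch of 100 random transitions, matching the quoted batch size.
batch = (torch.randn(100, state_dim),
         torch.randn(100, action_dim),
         torch.randn(100, state_dim))
print(training_step(*batch))
```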