Boosting for Control of Dynamical Systems
Authors: Naman Agarwal, Nataly Brukhim, Elad Hazan, Zhou Lu
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluation on a host of control settings supports our theoretical findings. |
| Researcher Affiliation | Collaboration | ¹Google AI Princeton; ²Department of Computer Science, Princeton University. Correspondence to: <namanagarwal@google.com, {nbrukhim,ehazan,zhoul}@princeton.edu>. |
| Pseudocode | Yes | Algorithm 1: DynaBoost1; Algorithm 2: DynaBoost2 |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code for the methodology described. |
| Open Datasets | No | The paper describes experiments within simulated environments (Linear Dynamical Systems, Inverted Pendulum, etc.) where data is generated by the simulation rather than from a pre-existing publicly available dataset. While some environments like OpenAI Gym are open, they are environments for simulation and not datasets in the typical sense that require specific access information. |
| Dataset Splits | No | The paper conducts experiments within simulated environments and does not mention specific training, validation, and test dataset splits in terms of percentages or sample counts for model training. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running its experiments. |
| Software Dependencies | No | The paper mentions using an "LSTM architecture" but does not specify version numbers for any software libraries, frameworks, or dependencies used in the experiments. |
| Experiment Setup | Yes | The GPC weak-controller is designed as in Equation 8, following (Agarwal et al., 2019), with the pre-fixed matrix K set to 0. The RNN weak-controller uses an LSTM architecture with 5 hidden units. We set the memory length to H = 5, and use N = 5 weak-learners in all the experiments. The cost function used in all settings is $c(x, u) = \lVert x \rVert_2^2 + \lVert u \rVert_2^2$. Each noise term $w_t$ is i.i.d. normally distributed with zero mean and $0.1^2$ variance. $w_{t+1} \sim \mathcal{N}(w_t, 0.3^2)$. $w_t = \sin(t)/2\pi$. $w_t \sim \mathcal{N}(w_{t-1}, 5 \times 10^{-3})$, where the noise values are then clipped to the range $[-0.5, 0.5]$. |
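
The cost function and the four noise models in the Experiment Setup row can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code (the paper releases none). It assumes scalar noise, reads the Gaussian and random-walk variances as 0.1² and 0.3², reads the sinusoid as sin(t)/(2π), treats 5e-3 as a variance, and starts both random walks at 0. All function names are hypothetical.

```python
import math
import random


def quadratic_cost(x, u):
    """c(x, u) = ||x||_2^2 + ||u||_2^2 for state x and control u (sequences of floats)."""
    return sum(xi * xi for xi in x) + sum(ui * ui for ui in u)


def gaussian_noise(T, std=0.1, seed=0):
    """i.i.d. zero-mean Gaussian noise with variance std**2 (assumed 0.1**2)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, std) for _ in range(T)]


def random_walk_noise(T, std=0.3, seed=0):
    """Random walk w_{t+1} ~ N(w_t, std**2); the start w_0 = 0 is an assumption."""
    rng = random.Random(seed)
    w, out = 0.0, []
    for _ in range(T):
        out.append(w)
        w = rng.gauss(w, std)
    return out


def sinusoidal_noise(T):
    """Deterministic perturbation, read here as w_t = sin(t) / (2 * pi)."""
    return [math.sin(t) / (2 * math.pi) for t in range(T)]


def clipped_walk_noise(T, var=5e-3, bound=0.5, seed=0):
    """Random walk w_t ~ N(w_{t-1}, var), clipped to [-bound, bound].

    Assumptions: 5e-3 is a variance, w_0 = 0, and the clipped value
    (not the raw sample) is carried into the next step.
    """
    rng = random.Random(seed)
    w, out = 0.0, []
    for _ in range(T):
        out.append(w)
        w = max(-bound, min(bound, rng.gauss(w, math.sqrt(var))))
    return out
```

Each generator returns a list of `T` scalar perturbations, so a simulated rollout could add `w[t]` to the state update at step `t` and accumulate `quadratic_cost(x, u)` along the trajectory.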