Optimal Control Via Neural Networks: A Convex Approach
Authors: Yize Chen, Yuanyuan Shi, Baosen Zhang
ICLR 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiment results demonstrate the good potential of the proposed input convex neural network based approach in a variety of control applications. In particular we show that in the MuJoCo locomotion tasks, we could achieve over 10% higher performance using 5× less time compared with the state-of-the-art model-based reinforcement learning method; and in the building HVAC control example, our method achieved up to 20% energy reduction compared with classic linear models. |
| Researcher Affiliation | Academia | Yize Chen, Yuanyuan Shi, Baosen Zhang. Department of Electrical and Computer Engineering, University of Washington, Seattle, WA 98195, USA. {yizechen, yyshi, zhangbao}@uw.edu |
| Pseudocode | No | The paper describes the algorithms and methods in text and mathematical equations but does not include a formal pseudocode block or algorithm box. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing open-source code or include any links to a code repository for the described methodology. |
| Open Datasets | No | The paper uses simulated environments (MuJoCo and EnergyPlus) to generate data for its experiments but does not provide access to a pre-existing publicly available dataset or the generated data itself. |
| Dataset Splits | No | While the paper mentions “validation rollouts” for monitoring performance during iterative training, it does not provide specific dataset split percentages or sample counts for train/validation/test splits of a static dataset needed for reproducibility. For the building example, it only mentions “10 months data to train” and “2 months data for testing” without a validation split. |
| Hardware Specification | Yes | All the experiments are run on a computer with an 8-core Intel i7-6700 CPU. |
| Software Dependencies | No | The paper mentions software like TensorFlow, MuJoCo, OpenAI rllab framework, and EnergyPlus, but it does not specify version numbers for any of these dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | We train both models using the Adam optimizer with a learning rate of 0.001 and a mini-batch size of 512. Due to the differing complexity of the MuJoCo tasks, we vary the number of training epochs and summarize the training details in Table 1. ... We set the model predictive control horizon T = 36 (six hours). We employ an ICRNN with a recurrent layer of dimension 200 to fit the building input-output dynamics f(·). The model is trained to minimize the MSE between its predictions and the actual building energy consumption using stochastic gradient descent. We use the same network structure and training scheme to fit the state transition dynamics g(·). A minimal illustrative training sketch follows the table. |
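As a rough illustration of the quoted setup, the sketch below trains a small input convex network with the stated hyperparameters (Adam, learning rate 0.001, mini-batch size 512, MSE loss). It is written in PyTorch on synthetic placeholder data; the paper itself uses TensorFlow and, for the building example, a recurrent ICRNN, so everything beyond the stated hyperparameters (layer sizes, data, the `ICNN` class itself) is an assumption for illustration, not the authors' implementation.

```python
# Illustrative sketch only: PyTorch stand-in for the paper's TensorFlow setup.
import torch
import torch.nn as nn

class ICNN(nn.Module):
    """Two-hidden-layer input convex network: the output is convex in x
    as long as the z->z weights (Wz1, Wz2) stay non-negative and the
    activations are convex and non-decreasing (ReLU)."""
    def __init__(self, x_dim, hidden=200):
        super().__init__()
        self.Wx0 = nn.Linear(x_dim, hidden)
        self.Wx1 = nn.Linear(x_dim, hidden)
        self.Wx2 = nn.Linear(x_dim, 1)
        self.Wz1 = nn.Linear(hidden, hidden, bias=False)  # constrained >= 0
        self.Wz2 = nn.Linear(hidden, 1, bias=False)       # constrained >= 0

    def forward(self, x):
        z1 = torch.relu(self.Wx0(x))
        z2 = torch.relu(self.Wx1(x) + self.Wz1(z1))
        return self.Wx2(x) + self.Wz2(z2)

    def project(self):
        # Projection step that enforces the convexity constraint.
        with torch.no_grad():
            self.Wz1.weight.clamp_(min=0.0)
            self.Wz2.weight.clamp_(min=0.0)

# Synthetic stand-in data; in the paper the targets are simulated MuJoCo
# transitions or EnergyPlus building energy consumption.
X = torch.randn(4096, 16)   # 16-dim input is a placeholder
Y = torch.randn(4096, 1)
loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(X, Y), batch_size=512, shuffle=True)

model = ICNN(x_dim=16)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)  # lr 0.001 as stated
loss_fn = nn.MSELoss()

for epoch in range(10):       # epoch counts vary per task (paper's Table 1)
    for xb, yb in loader:
        opt.zero_grad()
        loss_fn(model(xb), yb).backward()
        opt.step()
        model.project()       # keep z->z weights non-negative after each step
```

The projection after each optimizer update is what keeps the learned dynamics convex in the inputs, which is the property the paper relies on to turn the downstream control problem into a convex optimization.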