Optimal Control Via Neural Networks: A Convex Approach

Authors: Yize Chen, Yuanyuan Shi, Baosen Zhang

ICLR 2019

Reproducibility assessment (variable: result, followed by the LLM response):
Research Type: Experimental. Experiment results demonstrate the good potential of the proposed input convex neural network based approach in a variety of control applications. In particular, we show that in the MuJoCo locomotion tasks, we could achieve over 10% higher performance using 5× less time compared with state-of-the-art model-based reinforcement learning methods; and in the building HVAC control example, our method achieved up to 20% energy reduction compared with classic linear models.
Researcher Affiliation: Academia. Yize Chen, Yuanyuan Shi, Baosen Zhang; Department of Electrical and Computer Engineering, University of Washington, Seattle, WA 98195, USA. {yizechen, yyshi, zhangbao}@uw.edu
Pseudocode: No. The paper describes the algorithms and methods in text and mathematical equations but does not include a formal pseudocode block or algorithm box.
Open Source Code: No. The paper neither states that code is released nor links to a code repository for the described methodology.
Open Datasets: No. The paper generates data from simulated environments (MuJoCo and EnergyPlus) for its experiments but does not provide access to a pre-existing publicly available dataset or to the generated data itself.
Dataset Splits: No. While the paper mentions "validation rollouts" for monitoring performance during iterative training, it does not give the split percentages or sample counts for train/validation/test sets that reproducibility requires. For the building example it states only "10 months data to train" and "2 months data for testing", with no validation split.
Hardware Specification: Yes. All experiments were run on a computer with an 8-core Intel i7-6700 CPU.
Software Dependencies: No. The paper names software such as TensorFlow, MuJoCo, the OpenAI rllab framework, and EnergyPlus, but it does not specify version numbers for any of these dependencies, which is required for reproducibility.
Experiment Setup: Yes. Both models are trained using the Adam optimizer with a learning rate of 0.001 and a mini-batch size of 512. Because the MuJoCo tasks differ in complexity, the number of training epochs varies; the training details are summarized in Table 1. ... The model predictive control horizon is set to T = 36 (six hours). An ICRNN with a recurrent layer of dimension 200 is employed to fit the building input-output dynamics f(·). The model is trained to minimize the MSE between its predictions and the actual building energy consumption using stochastic gradient descent. The same network structure and training scheme are used to fit the state transition dynamics g(·).
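The approach assessed above rests on the paper's central construction, the input convex neural network (ICNN): weights along the hidden path are constrained to be non-negative and activations are convex and non-decreasing, so the learned mapping is convex in its inputs and the downstream control problem becomes a convex program. Below is a minimal pure-Python sketch of this convexity mechanism; the two-layer structure, layer sizes, and weight values are illustrative assumptions, not the paper's actual architecture or parameters.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def vadd(a, b):
    return [p + q for p, q in zip(a, b)]

class ICNN:
    """Two-layer input convex neural network (illustrative sketch).

    The output f(x) is convex in x because:
      * ReLU is convex and non-decreasing,
      * the weights on the hidden ("z") path are clamped non-negative,
        so each layer is a non-negative combination of convex functions
        plus an affine passthrough of the input x.
    """

    def __init__(self, Wx0, b0, Wz1, Wx1, b1):
        # ICNN constraint: z-path weights must be >= 0 to preserve convexity.
        self.Wz1 = [[max(0.0, w) for w in row] for row in Wz1]
        self.Wx0, self.b0, self.Wx1, self.b1 = Wx0, b0, Wx1, b1

    def __call__(self, x):
        z1 = relu(vadd(matvec(self.Wx0, x), self.b0))  # each entry convex in x
        z2 = vadd(vadd(matvec(self.Wz1, z1), matvec(self.Wx1, x)), self.b1)
        return sum(relu(z2))  # scalar output, convex in x

# Numerical sanity check: midpoint convexity f((a+b)/2) <= (f(a)+f(b))/2.
rng = random.Random(0)
rand_mat = lambda r, c: [[rng.uniform(-1, 1) for _ in range(c)] for _ in range(r)]
net = ICNN(rand_mat(4, 3), [rng.uniform(-1, 1) for _ in range(4)],
           rand_mat(2, 4), rand_mat(2, 3), [0.0, 0.0])
for _ in range(100):
    a = [rng.uniform(-2, 2) for _ in range(3)]
    b = [rng.uniform(-2, 2) for _ in range(3)]
    mid = [(p + q) / 2 for p, q in zip(a, b)]
    assert net(mid) <= (net(a) + net(b)) / 2 + 1e-9
```

Because the fitted dynamics are convex in the control inputs, optimizing those inputs (e.g., by gradient descent inside the MPC loop described above) reaches a global rather than local optimum, which is the property the paper exploits.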