Scalable Planning with Tensorflow for Hybrid Nonlinear Domains

Authors: Ga Wu, Buser Say, Scott Sanner

NeurIPS 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we introduce our three benchmark domains and then validate Tensorflow planning performance in the following steps. (1) We evaluate the optimality of Tensorflow backpropagation planning on linear and bilinear domains through comparison with the optimal solution given by Mixed Integer Linear Programming (MILP). (2) We evaluate the performance of Tensorflow backpropagation planning on nonlinear domains (that MILPs cannot handle) through comparison with the Matlab-based interior point nonlinear solver FMINCON. (3) We investigate the impact of several popular gradient descent optimizers on planning performance. (4) We evaluate optimization of the learning rate. (5) We investigate how other state-of-the-art hybrid planners perform.
Researcher Affiliation | Academia | Ga Wu, Buser Say, Scott Sanner. Department of Mechanical & Industrial Engineering, University of Toronto, Canada. Email: {wuga,bsay,ssanner}@mie.utoronto.ca
Pseudocode | No | The paper describes its methods verbally and mathematically but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access information (a specific link, an explicit statement, or a mention of supplementary materials) for open-source code implementing the described methodology.
Open Datasets | No | The paper defines and uses simulated environments (Navigation, Reservoir Control, HVAC) rather than pre-existing, publicly available datasets. Therefore, it does not provide concrete access information for a public dataset.
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology), as it operates on simulated environments rather than static datasets with explicit splits.
Hardware Specification | Yes | We ran our experiments on an Ubuntu Linux system with one E5-1620 v4 CPU, 16GB RAM, and one GTX1080 GPU.
Software Dependencies | Yes | The Tensorflow version is beta 0.12.1, the Matlab version is R2016b, and the MILP solver is IBM ILOG CPLEX 12.6.3.
Experiment Setup | Yes | In this experiment, we investigate the effects of different backpropagation optimizers. In Figure 6(a), we show that the RMSProp optimizer provides exceptionally fast convergence among the five standard optimizers of Tensorflow. This observation reflects the previous analysis and discussion concerning equation (4) that RMSProp manages to avoid exploding gradients. As mentioned, although Adagrad and Adadelta have similar mechanisms, their normalization methods may cause vanishing gradients after several epochs, which corresponds to our observation of nearly flat curves for these methods. This is a strong indicator that exploding gradients are a significant concern for hybrid planning with gradient descent and that RMSProp performs well despite this well-known potential problem for gradients over long horizons.
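
To make the planning-by-backpropagation procedure quoted in the Research Type row concrete, the following is a minimal sketch of optimizing a plan by gradient descent through an unrolled, differentiable transition model. It is written for current TensorFlow 2 rather than the beta 0.12.1 release reported above, and the transition function, cost function, horizon, and goal state are illustrative stand-ins, not the paper's Navigation, Reservoir Control, or HVAC dynamics.

# Hypothetical sketch of planning by backpropagation in TensorFlow 2.
# `transition`, `cost`, HORIZON, and the goal are placeholders chosen for
# illustration; they are not taken from the paper.
import tensorflow as tf

HORIZON = 20                  # number of unrolled time steps
STATE_DIM, ACTION_DIM = 2, 2

def transition(state, action):
    # Placeholder nonlinear dynamics: the action is scaled by a
    # state-dependent factor before being added to the state.
    return state + action * tf.tanh(tf.reduce_sum(state, axis=-1, keepdims=True))

def cost(state, goal):
    # Squared distance to a goal state.
    return tf.reduce_sum(tf.square(state - goal))

initial_state = tf.constant([[0.0, 0.0]])
goal_state = tf.constant([[8.0, 9.0]])

# The plan itself is a trainable tensor of actions over the horizon.
actions = tf.Variable(tf.zeros([HORIZON, 1, ACTION_DIM]))
optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.1)

for epoch in range(500):
    with tf.GradientTape() as tape:
        state, total_cost = initial_state, 0.0
        for t in range(HORIZON):
            state = transition(state, actions[t])
            total_cost += cost(state, goal_state)
    grads = tape.gradient(total_cost, [actions])
    optimizer.apply_gradients(list(zip(grads, [actions])))

print("final plan cost:", float(total_cost))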
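
The optimizer comparison quoted in the Experiment Setup row can be reproduced in spirit with the sketch below, which minimizes the same kind of unrolled planning loss with each of the standard TensorFlow optimizers and records the loss curve for each. The `plan_loss` stand-in, the horizon of 20, the epoch count, and the learning rate of 0.01 are assumptions for illustration only and are not taken from the paper.

# Hypothetical sketch of comparing gradient descent optimizers on a toy
# planning objective; `plan_loss` stands in for the unrolled planning cost
# from the previous sketch.
import tensorflow as tf

def plan_loss(actions):
    # Toy stand-in: cumulative actions should reach a target value of 5.0.
    return tf.reduce_sum(tf.square(tf.cumsum(actions) - 5.0))

optimizers = {
    "SGD": tf.keras.optimizers.SGD(0.01),
    "Adagrad": tf.keras.optimizers.Adagrad(0.01),
    "Adadelta": tf.keras.optimizers.Adadelta(0.01),
    "Adam": tf.keras.optimizers.Adam(0.01),
    "RMSProp": tf.keras.optimizers.RMSprop(0.01),
}

curves = {}
for name, opt in optimizers.items():
    actions = tf.Variable(tf.zeros([20]))   # fresh plan for each optimizer
    losses = []
    for epoch in range(200):
        with tf.GradientTape() as tape:
            loss = plan_loss(actions)
        opt.apply_gradients([(tape.gradient(loss, actions), actions)])
        losses.append(float(loss))
    curves[name] = losses
    print(name, "final loss:", losses[-1])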