Scalable Planning with Tensorflow for Hybrid Nonlinear Domains
Authors: Ga Wu, Buser Say, Scott Sanner
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we introduce our three benchmark domains and then validate Tensorflow planning performance in the following steps. (1) We evaluate the optimality of Tensorflow backpropagation planning on linear and bilinear domains through comparison with the optimal solution given by Mixed Integer Linear Programming (MILP). (2) We evaluate the performance of Tensorflow backpropagation planning on nonlinear domains (that MILPs cannot handle) through comparison with the Matlab-based interior point nonlinear solver FMINCON. (3) We investigate the impact of several popular gradient descent optimizers on planning performance. (4) We evaluate optimization of the learning rate. (5) We investigate how other state-of-the-art hybrid planners perform. (A minimal code sketch of this planning-by-backpropagation loop appears after the table.) |
| Researcher Affiliation | Academia | Ga Wu, Buser Say, Scott Sanner; Department of Mechanical & Industrial Engineering, University of Toronto, Canada; email: {wuga,bsay,ssanner}@mie.utoronto.ca |
| Pseudocode | No | The paper describes its methods verbally and mathematically but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access information (specific link, explicit statement, or mention in supplementary materials) for the open-source code of the methodology described in this paper. |
| Open Datasets | No | The paper defines and uses simulated environments (Navigation, Reservoir Control, HVAC) rather than pre-existing, publicly available datasets. Therefore, it does not provide concrete access information for a public dataset. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) as it operates on simulated environments rather than static datasets with explicit splits. |
| Hardware Specification | Yes | We ran our experiments on an Ubuntu Linux system with one E5-1620 v4 CPU, 16GB RAM, and one GTX1080 GPU. |
| Software Dependencies | Yes | The Tensorflow version is beta 0.12.1, the Matlab version is R2016b, and the MILP solver is IBM ILOG CPLEX 12.6.3. |
| Experiment Setup | Yes | In this experiment, we investigate the effects of different backpropagation optimizers. In Figure 6(a), we show that the RMSProp optimizer provides exceptionally fast convergence among the five standard optimizers of Tensorflow. This observation reflects the previous analysis and discussion concerning equation (4): RMSProp manages to avoid exploding gradients. As mentioned, although Adagrad and Adadelta have similar mechanisms, their normalization methods may cause vanishing gradients after several epochs, which corresponds to our observation of nearly flat curves for these methods. This is a strong indicator that exploding gradients are a significant concern for hybrid planning with gradient descent, and that RMSProp performs well despite this well-known problem for gradients over long horizons. (A sketch comparing these optimizers follows the planning sketch below.) |
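
The planning-by-backpropagation technique quoted in the Research Type row can be summarized in a few lines. The sketch below is illustrative only: the paper used Tensorflow beta 0.12.1 with its graph-based API, while this uses the modern TF 2 eager API for readability, and the linear transition, quadratic cost, horizon, and learning rate are hypothetical stand-ins rather than the paper's Navigation, Reservoir Control, or HVAC domains. The key idea is that the action sequence over the horizon is the only trainable variable; the known, differentiable transition function is unrolled and the cumulative cost is minimized directly by gradient descent.

```python
import tensorflow as tf

# Minimal planning-by-backpropagation loop over a toy domain.
HORIZON = 20
A, B = 1.0, 0.1                  # hypothetical linear transition: s' = A*s + B*a
GOAL = tf.constant(5.0)          # hypothetical goal state

def rollout_cost(actions):
    """Unroll the dynamics over the horizon and accumulate a quadratic cost."""
    state = tf.constant(0.0)
    cost = tf.constant(0.0)
    for t in range(HORIZON):
        state = A * state + B * actions[t]   # differentiable dynamics step
        cost += tf.square(state - GOAL)      # distance-to-goal stage cost
    return cost

actions = tf.Variable(tf.zeros([HORIZON]))   # the plan being optimized
optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.1)

for epoch in range(500):
    with tf.GradientTape() as tape:
        cost = rollout_cost(actions)
    grads = tape.gradient(cost, [actions])   # d(cost)/d(plan) in one backward pass
    optimizer.apply_gradients(zip(grads, [actions]))

print("final cost:", float(cost))
```

Because the entire rollout is a single differentiable computation, gradients with respect to every action in the plan are obtained in one backward pass, which is what lets a GPU-backed framework like Tensorflow scale this scheme to long horizons and many plans in parallel.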
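
The optimizer comparison described in the Experiment Setup row can be reproduced against the same objective. This is a hedged sketch mirroring the experiment behind Figure 6(a), not the paper's code: it reuses the import, HORIZON, and rollout_cost from the previous sketch, and the shared learning rate of 0.1 is an illustrative choice.

```python
# Compare Tensorflow's stock optimizers on the same planning objective.
# (Assumes `import tensorflow as tf`, HORIZON, and rollout_cost from above.)
optimizers = {
    "SGD":      tf.keras.optimizers.SGD(learning_rate=0.1),
    "Adagrad":  tf.keras.optimizers.Adagrad(learning_rate=0.1),
    "Adadelta": tf.keras.optimizers.Adadelta(learning_rate=0.1),
    "Adam":     tf.keras.optimizers.Adam(learning_rate=0.1),
    "RMSProp":  tf.keras.optimizers.RMSprop(learning_rate=0.1),
}

for name, opt in optimizers.items():
    plan = tf.Variable(tf.zeros([HORIZON]))     # fresh plan for each optimizer
    for epoch in range(200):
        with tf.GradientTape() as tape:
            cost = rollout_cost(plan)
        opt.apply_gradients(zip(tape.gradient(cost, [plan]), [plan]))
    print(f"{name:>8s} final cost: {cost.numpy():.3f}")
```

RMSProp normalizes each step by a moving average of recent squared gradients, which damps the exploding gradients the quoted passage identifies as the main hazard over long horizons; Adagrad and Adadelta instead normalize by accumulated gradient history, which can shrink effective step sizes toward zero after many epochs and produce the nearly flat convergence curves the paper reports.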