A Stochastic Derivative Free Optimization Method with Momentum
Authors: Eduard Gorbunov, Adel Bibi, Ozan Sener, El Houcine Bergou, Peter Richtárik
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on continuous control tasks from the MuJoCo suite (Todorov et al., 2012). SMTP significantly outperforms STP and all other methods that we considered in our numerical experiments. |
| Researcher Affiliation | Collaboration | Eduard Gorbunov (MIPT, Russia; IITP RAS, Russia; RANEPA, Russia; eduard.gorbunov@phystech.edu); Adel Bibi (KAUST, Saudi Arabia; adel.bibi@kaust.edu.sa); Ozan Sener (Intel Labs; ozan.sener@intel.com); El Houcine Bergou (KAUST, Saudi Arabia; MaIAGE, INRA, France; elhoucine.bergou@inra.fr); Peter Richtárik (KAUST, Saudi Arabia; MIPT, Russia; peter.richtarik@kaust.edu.sa) |
| Pseudocode | Yes | Algorithm 1 SMTP: Stochastic Momentum Three Points and Algorithm 2 SMTP_IS: Stochastic Momentum Three Points with Importance Sampling |
| Open Source Code | No | The code will be made available online upon acceptance of this work. |
| Open Datasets | Yes | We conduct extensive experiments on challenging non-convex problems on the continuous control task from the MuJoCo suite (Todorov et al., 2012). |
| Dataset Splits | Yes | Similar to Bibi et al. (2019), these values were chosen based on the validation performance over the grid K ∈ {1, 2, 4, 8, 16} for the smaller dimensional problems Swimmer-v1, Hopper-v1, and HalfCheetah-v1, and K ∈ {20, 40, 80, 120} for the larger dimensional problems Ant-v1 and Humanoid-v1. |
| Hardware Specification | No | The paper does not explicitly mention any specific hardware details such as GPU/CPU models or memory specifications used for running experiments. |
| Software Dependencies | No | The paper mentions 'MuJoCo' but does not provide specific version numbers for this or any other software dependencies. |
| Experiment Setup | Yes | Similar to the work in Bibi et al. (2019), we use K = 2 for Swimmer-v1, K = 4 for both Hopper-v1 and HalfCheetah-v1, and K = 40 for Ant-v1 and Humanoid-v1. As for the momentum term, for SMTP we set β = 0.5. |
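
The reported hyperparameters can be captured in a small configuration sketch. The block below is illustrative only and is not the authors' code (which the table lists as unreleased): the names `REPORTED_K`, `SMTP_BETA`, `k_grid`, and `select_k` are hypothetical, and the `validation_score` callback stands in for whatever validation procedure the paper used, assuming a higher score is better.

```python
from typing import Callable, Dict, Tuple

# Values of K reported in the Experiment Setup row.
REPORTED_K: Dict[str, int] = {
    "Swimmer-v1": 2,
    "Hopper-v1": 4,
    "HalfCheetah-v1": 4,
    "Ant-v1": 40,
    "Humanoid-v1": 40,
}

# Momentum parameter reported for SMTP.
SMTP_BETA = 0.5

# Search grids reported in the Dataset Splits row.
K_GRID_SMALL: Tuple[int, ...] = (1, 2, 4, 8, 16)
K_GRID_LARGE: Tuple[int, ...] = (20, 40, 80, 120)
LARGE_ENVS = {"Ant-v1", "Humanoid-v1"}


def k_grid(env_name: str) -> Tuple[int, ...]:
    """Return the grid of K values searched for the given environment."""
    return K_GRID_LARGE if env_name in LARGE_ENVS else K_GRID_SMALL


def select_k(env_name: str, validation_score: Callable[[str, int], float]) -> int:
    """Pick the grid value of K with the highest validation score.

    `validation_score` is a hypothetical callback; the table does not
    describe how validation performance was actually measured.
    """
    return max(k_grid(env_name), key=lambda k: validation_score(env_name, k))
```

Every reported value of K lies on the grid stated for its environment, so a grid search of this shape is at least consistent with the figures in the table.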