A Stochastic Derivative Free Optimization Method with Momentum

Authors: Eduard Gorbunov, Adel Bibi, Ozan Sener, El Houcine Bergou, Peter Richtárik

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments on continuous control tasks from the MuJoCo suite Todorov et al. (2012). SMTP significantly outperforms STP and all other methods that we considered in our numerical experiments.
Researcher Affiliation | Collaboration | Eduard Gorbunov (MIPT, Russia; IITP RAS, Russia; RANEPA, Russia; eduard.gorbunov@phystech.edu); Adel Bibi (KAUST, Saudi Arabia; adel.bibi@kaust.edu.sa); Ozan Sener (Intel Labs; ozan.sener@intel.com); El Houcine Bergou (KAUST, Saudi Arabia; MaIAGE, INRA, France; elhoucine.bergou@inra.fr); Peter Richtárik (KAUST, Saudi Arabia; MIPT, Russia; peter.richtarik@kaust.edu.sa)
Pseudocode | Yes | Algorithm 1, SMTP: Stochastic Momentum Three Points; Algorithm 2, SMTP_IS: Stochastic Momentum Three Points with Importance Sampling
Open Source Code | No | The code will be made available online upon acceptance of this work.
Open Datasets | Yes | We conduct extensive experiments on challenging non-convex problems on the continuous control task from the MuJoCo suite Todorov et al. (2012).
Dataset Splits | Yes | Similar to Bibi et al. (2019), these values were chosen based on the validation performance over the grid that is K ∈ {1, 2, 4, 8, 16} for the smaller dimensional problems Swimmer-v1, Hopper-v1, HalfCheetah-v1 and K ∈ {20, 40, 80, 120} for larger dimensional problems Ant-v1 and Humanoid-v1.
Hardware Specification | No | The paper does not mention specific hardware details, such as GPU/CPU models or memory specifications, used for running the experiments.
Software Dependencies | No | The paper mentions 'MuJoCo' but does not provide version numbers for this or any other software dependency.
Experiment Setup | Yes | Similar to the work in Bibi et al. (2019), we use K = 2 for Swimmer-v1, K = 4 for both Hopper-v1 and HalfCheetah-v1, K = 40 for Ant-v1 and Humanoid-v1. As for the momentum term, for SMTP we set β = 0.5.
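
Since the authors' code is not linked above, the following is a minimal, unofficial sketch of what a momentum three-point step could look like. It is not a transcription of the paper's Algorithm 1: the function name smtp_minimize, the fixed step size, the unit-sphere sampling of directions, and the quadratic test objective are all assumptions made for illustration. Only the momentum value β = 0.5 is taken from the setup reported above.

```python
import numpy as np


def smtp_minimize(f, x0, step_size=0.1, beta=0.5, num_iters=500, seed=0):
    """Hypothetical sketch of a derivative-free three-point method with momentum.

    Each iteration compares the current virtual iterate z with two candidates
    built from a random direction +s / -s and keeps whichever has the lowest
    objective value, so f(z) never increases.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    v = np.zeros_like(x)          # momentum buffer, starts at zero
    z = x.copy()                  # virtual iterate used for the comparison
    f_z = f(z)
    scale = step_size * beta / (1.0 - beta)

    for _ in range(num_iters):
        # Assumption: directions are drawn uniformly from the unit sphere.
        s = rng.standard_normal(x.shape)
        s /= np.linalg.norm(s)

        best = (f_z, x, v, z)     # "stay put" is the third point
        for v_try in (beta * v + s, beta * v - s):
            x_try = x - step_size * v_try
            z_try = x_try - scale * v_try
            f_try = f(z_try)
            if f_try < best[0]:
                best = (f_try, x_try, v_try, z_try)
        f_z, x, v, z = best

    return z, f_z


if __name__ == "__main__":
    def sphere(w):
        # Toy objective chosen for illustration only.
        return float(np.dot(w, w))

    z_final, f_final = smtp_minimize(sphere, np.ones(5))
    print("final objective:", f_final)
```

The method only queries function values, which is what makes this family of methods applicable to black-box objectives such as episode returns in continuous control. The K values listed in the table belong to the paper's experimental setup and are not modeled in this sketch.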