Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

A Stochastic Derivative Free Optimization Method with Momentum

Authors: Eduard Gorbunov, Adel Bibi, Ozan Sener, El Houcine Bergou, Peter Richtárik

ICLR 2020 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We conduct extensive experiments on continuous control tasks from the MuJoCo suit Todorov et al. (2012). SMTP significantly outperforms STP and all other methods that we considered in our numerical experiments. |
| Researcher Affiliation | Collaboration | Eduard Gorbunov: MIPT, Russia and IITP RAS, Russia and RANEPA, Russia, EMAIL; Adel Bibi: KAUST, Saudi Arabia, EMAIL; Ozan Sener: Intel Labs, EMAIL; El Houcine Bergou: KAUST, Saudi Arabia and MaIAGE, INRA, France, EMAIL; Peter Richtárik: KAUST, Saudi Arabia and MIPT, Russia, EMAIL |
| Pseudocode | Yes | Algorithm 1 SMTP: Stochastic Momentum Three Points and Algorithm 2 SMTP_IS: Stochastic Momentum Three Points with Importance Sampling |
| Open Source Code | No | The code will be made available online upon acceptance of this work. |
| Open Datasets | Yes | We conduct extensive experiments on challenging non-convex problems on the continuous control task from the MuJoCo suit Todorov et al. (2012). |
| Dataset Splits | Yes | Similar to Bibi et al. (2019), these values were chosen based on the validation performance over the grid that is K ∈ {1, 2, 4, 8, 16} for the smaller dimensional problems Swimmer-v1, Hopper-v1, HalfCheetah-v1 and K ∈ {20, 40, 80, 120} for larger dimensional problems Ant-v1, and Humanoid-v1. |
| Hardware Specification | No | The paper does not explicitly mention any specific hardware details such as GPU/CPU models or memory specifications used for running experiments. |
| Software Dependencies | No | The paper mentions 'MuJoCo' but does not provide specific version numbers for this or any other software dependencies. |
| Experiment Setup | Yes | Similar to the work in Bibi et al. (2019), we use K = 2 for Swimmer-v1, K = 4 for both Hopper-v1 and HalfCheetah-v1, K = 40 for Ant-v1 and Humanoid-v1. As for the momentum term, for SMTP we set β = 0.5. |
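
Since the authors' code is listed as unavailable, the following is a rough, unofficial sketch of what one momentum three-point step might look like, using the reported momentum value β = 0.5. Everything in it is an assumption rather than something stated on this page: the update rule is a reading of the general "three points with momentum" idea (compare the current point against two candidates built from a random direction and its negation), and the function name `smtp_step`, the step size `gamma`, the unit-sphere sampling of directions, and the toy quadratic objective are all hypothetical. Algorithm 1 in the paper is the authoritative statement.

```python
import numpy as np


def smtp_step(f, x, v, z, gamma, beta, rng):
    """One derivative-free step with momentum (illustrative sketch only).

    x, v, z are the iterate, the momentum buffer, and the auxiliary point
    that is actually compared by function value. All names are hypothetical.
    """
    # Assumption: directions are drawn uniformly from the unit sphere.
    s = rng.standard_normal(x.shape)
    s /= np.linalg.norm(s)

    # Candidate 0: keep the current state unchanged.
    candidates = [(f(z), x, v, z)]

    # Candidates 1 and 2: move along +s and -s, each with momentum.
    for sign in (1.0, -1.0):
        v_new = beta * v + sign * s
        x_new = x - gamma * v_new
        z_new = x_new - gamma * beta / (1.0 - beta) * v_new
        candidates.append((f(z_new), x_new, v_new, z_new))

    # Keep whichever of the three points has the lowest objective value.
    _, x_best, v_best, z_best = min(candidates, key=lambda c: c[0])
    return x_best, v_best, z_best


# Toy usage on a quadratic (not one of the MuJoCo tasks from the paper).
rng = np.random.default_rng(0)
f = lambda w: float(np.sum(w ** 2))
x = np.ones(5)
v = np.zeros(5)
z = x.copy()
history = [f(z)]
for _ in range(200):
    x, v, z = smtp_step(f, x, v, z, gamma=0.1, beta=0.5, rng=rng)
    history.append(f(z))
```

Because the current point is always one of the three candidates, the objective at `z` cannot increase from one step to the next in this sketch. The step size and iteration count here are arbitrary choices for the toy problem and are not taken from the paper.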