Minibatch Stochastic Three Points Method for Unconstrained Smooth Minimization

Authors: Soumia Boucherouite, Grigory Malinovsky, Peter Richtárik, El Houcine Bergou

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform extensive numerical evaluations to assess the computational efficiency of MiSTP and compare its performance to other state-of-the-art methods by testing it on several machine learning tasks.
Researcher Affiliation | Academia | Soumia Boucherouite¹, Grigory Malinovsky², Peter Richtárik², El Houcine Bergou¹. ¹College of Computing, Mohammed VI Polytechnic University, Ben Guerir, Morocco. ²King Abdullah University of Science and Technology, Thuwal, Saudi Arabia.
Pseudocode | Yes | Algorithm 1: Minibatch Stochastic Three Points (MiSTP)
Open Source Code | Yes | All codes for the experiments are available at: https://github.com/SoumiaBouch/Minibatch-STP.
Open Datasets | Yes | The experiments of this section are conducted using LIBSVM datasets (Chang and Lin 2011).
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) for training, validation, or testing.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers.
Experiment Setup | Yes | For each minibatch size, we choose the learning rate α by performing a grid search on the values 1, 0.1, 0.01, ... and select the one that gives the best performance. [...] The architecture we used has three fully-connected layers of size 256, 128, 10, with ReLU activation after the first two layers and a Softmax activation function after the last layer. The loss function is the categorical cross entropy. [...] we generate an adversarial attack to a set of n = 10 images of class 1 using a minibatch size of τ = 5 and a fixed stepsize α = 2 for MiSTP, α = 5/d for ZO-SVRG, and α = 30/d for both RSGF and ZO-SVRG-Ave. We set the epoch length to 10, µ = 0.01, and c = 1.
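
The grid search over the learning rate α quoted in the experiment setup can be illustrated with a minimal sketch. It is not the authors' code. It assumes a three-point update in the style of the STP family (compare the current iterate with one step forward and one step backward along a random direction, with all three evaluated on the same minibatch), a toy least-squares objective in place of the LIBSVM tasks, and directions drawn uniformly from the unit sphere. The names `mistp_step`, `run_mistp`, and `grid_search_alpha`, the iteration count, and the truncated grid are hypothetical choices made for illustration.

```python
import numpy as np


def mistp_step(x, loss_on_batch, alpha, rng):
    """One assumed three-point step: keep the best of x, x + alpha*s, x - alpha*s."""
    s = rng.standard_normal(x.shape)
    s /= np.linalg.norm(s)  # assumption: direction sampled uniformly on the unit sphere
    candidates = [x, x + alpha * s, x - alpha * s]
    values = [loss_on_batch(c) for c in candidates]
    return candidates[int(np.argmin(values))]


def run_mistp(A, b, alpha, tau, iters=500, seed=0):
    """Run the sketch on a toy least-squares problem and return the final full loss."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(iters):
        # Sample a minibatch of size tau without replacement.
        idx = rng.choice(n, size=tau, replace=False)

        def loss_on_batch(z, idx=idx):
            residual = A[idx] @ z - b[idx]
            return 0.5 * np.mean(residual ** 2)

        x = mistp_step(x, loss_on_batch, alpha, rng)
    return 0.5 * np.mean((A @ x - b) ** 2)


def grid_search_alpha(A, b, tau, grid=(1.0, 0.1, 0.01, 0.001)):
    """Try each step size in the grid and return the one with the lowest final loss."""
    results = {alpha: run_mistp(A, b, alpha, tau) for alpha in grid}
    best = min(results, key=results.get)
    return best, results


# Toy data (hypothetical): 200 samples, 10 features, noiseless linear targets.
data_rng = np.random.default_rng(42)
A = data_rng.standard_normal((200, 10))
b = A @ data_rng.standard_normal(10)

best_alpha, results = grid_search_alpha(A, b, tau=5)
print("best alpha:", best_alpha)
```

Because the current iterate is always one of the three candidates, a single step cannot increase the loss on the minibatch it was evaluated on. The full-dataset loss carries no such guarantee, which is one reason the step size is tuned separately for each minibatch size.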