Finite-Sum Coupled Compositional Stochastic Optimization: Theory and Applications

Authors: Bokun Wang, Tianbao Yang

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments on AP maximization, NCA, and p-norm push corroborate some aspects of the theory. |
| Researcher Affiliation | Academia | Bokun Wang¹, Tianbao Yang¹. ¹Department of Computer Science, The University of Iowa, IA, USA. Correspondence to: Bokun Wang <bokunw.wang@gmail.com>, Tianbao Yang <tianbaoyang@uiowa.edu>. |
| Pseudocode | Yes | Algorithm 1: SOX(w_0, u_0, v_0, η, β, γ, T); Algorithm 2: SOX-boost(w_1, u_1, v_1, K). (A hedged sketch of the SOX update follows the table.) |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | "We conduct experiments on two image datasets, namely CIFAR-10 and CIFAR-100." "We conduct our experiment on two LibSVM datasets: covtype and ijcnn1." "The experiment is performed on three datasets: sensorless, usps, and mnist from LibSVM (Chang & Lin, 2011)." |
| Dataset Splits | Yes | "To prevent overfitting, algorithms are early stopped when the validation loss reaches the minimum." |
| Hardware Specification | Yes | "The experiments are performed on a node of a cluster with a single GeForce RTX 2080 Ti GPU." "The algorithms are implemented with Python and run on a server with a 12-core Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz." |
| Software Dependencies | No | The paper states "The algorithms are implemented with Python", but does not provide specific version numbers for Python or any other software libraries or dependencies used in the experiments. |
| Experiment Setup | Yes | "In all experiments, we tune the initial learning rate in the range 10^{-4:1:-1} (i.e., {10^-4, 10^-3, 10^-2, 10^-1}) to achieve the best validation error, and decrease the learning rate at 50% and 75% of total epochs" (a sketch of this step schedule follows the table). "We tune the value of γ and fix β = 0.1" (i.e., 1 − β = 0.9, matching the default value 0.9 of gradient momentum). "For the stochastic algorithms (BSGD, SOAP, MOAP, SOX), we choose B = 64 and B1 = B2." |