Multi-block-Single-probe Variance Reduced Estimator for Coupled Compositional Optimization

Authors: Wei Jiang, Gang Li, Yibo Wang, Lijun Zhang, Tianbao Yang

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical studies on multi-task deep AUC maximization demonstrate the better performance of using the new estimator. In this section, we conduct experiments on the multi-task deep AUC maximization to evaluate the proposed methods and we will consider more applications in the long version of the paper.
Researcher Affiliation | Academia | 1 National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; 2 Department of Computer Science, the University of Iowa, Iowa City, USA; 3 Department of Computer Science and Engineering, Texas A&M University, College Station, USA
Pseudocode | Yes | Algorithm 1 MSVR-v1 and MSVR-v2 method. Algorithm 2 MSVR-v3 method. (A minimal sketch of the MSVR update follows the table.)
Open Source Code | No | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [No]
Open Datasets | Yes | We use ResNet18 as backbone network, and train on six datasets: STL10 [Coates et al., 2011], CIFAR10 [Krizhevsky, 2009], CIFAR100 [Krizhevsky, 2009], MNIST [LeCun et al., 1998], Fashion-MNIST [Xiao et al., 2017], and SVHN [Netzer et al., 2011].
Dataset Splits | No | The paper mentions training on specific datasets but does not explicitly provide the train/validation/test splits, their percentages, or the methodology used to create them.
Hardware Specification | Yes | The experiments are conducted on a single NVIDIA Tesla M40 GPU.
Software Dependencies | No | The paper does not specify software dependencies with version numbers, such as "Python 3.8", "PyTorch 1.9", or "CUDA 11.1".
Experiment Setup | Yes | For our methods, parameters α and β are searched from {0.1, 0.5, 0.9, 1.0}. For SOX algorithm, its parameters β and γ are searched from the same set. B1 is set as 50 for CIFAR100 and 5 for other datasets. Inner batch size B2 is chosen as 128 for all methods. We tune the learning rate from the set {1e-4, 1e-3, 2e-3, 5e-3, 1e-2} and pick the best one for each method. (The reported grid is restated as a configuration sketch after the table.)
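
The Pseudocode row refers to the paper's Algorithms 1-2 (the MSVR estimator and its variants). Below is a minimal sketch of the core idea: at each iteration only a sampled block of inner-function estimates is refreshed, using a moving average plus an error-correction term built from two consecutive iterates. The function name msvr_update, the array layout, and the concrete value of the correction weight gamma are illustrative assumptions rather than the authors' code; the paper derives a specific gamma from beta, the number of blocks, and the block batch size.

```python
import numpy as np

def msvr_update(u, sampled, g_curr, g_prev, beta, gamma):
    """One MSVR-style refresh of the inner-function estimates u_i ~ g_i(w_t).

    u        : (m, d) array holding the current estimate for each of the m blocks
    sampled  : indices of the B1 blocks drawn at this iteration
    g_curr   : (B1, d) mini-batch values g_i(w_t; B2) for the sampled blocks
    g_prev   : (B1, d) mini-batch values g_i(w_{t-1}; B2) for the sampled blocks
    beta     : moving-average weight
    gamma    : error-correction weight (the paper sets it from beta, m, and B1)
    """
    u_new = u.copy()  # blocks outside the sample keep their previous estimate
    u_new[sampled] = (
        (1.0 - beta) * u[sampled]
        + beta * g_curr
        + gamma * (g_curr - g_prev)
    )
    return u_new

# Toy usage: m = 10 blocks of dimension d = 3, B1 = 5 blocks refreshed per step.
rng = np.random.default_rng(0)
m, d, B1 = 10, 3, 5
u = np.zeros((m, d))
sampled = rng.choice(m, size=B1, replace=False)
g_curr = rng.normal(size=(B1, d))   # stands in for g_i(w_t; B2)
g_prev = rng.normal(size=(B1, d))   # stands in for g_i(w_{t-1}; B2)
u = msvr_update(u, sampled, g_curr, g_prev, beta=0.5, gamma=0.5)
```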
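
The Open Datasets and Experiment Setup rows together pin down the reported search space. The configuration below restates it as a runnable grid; the numeric values are quoted from the paper, while the variable names, the dictionary layout, and the reading of B1 as an outer (block) batch size are assumptions made for illustration.

```python
from itertools import product

# Values quoted from the Experiment Setup row; names and layout are illustrative.
DATASETS = ["STL10", "CIFAR10", "CIFAR100", "MNIST", "FashionMNIST", "SVHN"]

GRID = {
    "alpha": [0.1, 0.5, 0.9, 1.0],            # MSVR parameter alpha
    "beta":  [0.1, 0.5, 0.9, 1.0],            # MSVR parameter beta (SOX's beta, gamma use the same set)
    "lr":    [1e-4, 1e-3, 2e-3, 5e-3, 1e-2],  # learning rate, best picked per method
}

INNER_BATCH_SIZE = 128  # B2, shared by all methods

def outer_batch_size(dataset: str) -> int:
    """B1 as reported: 50 for CIFAR100, 5 for the other datasets."""
    return 50 if dataset == "CIFAR100" else 5

# Enumerate every hyperparameter combination to be tried on a given dataset.
configs = [dict(zip(GRID, values)) for values in product(*GRID.values())]
print(f"{len(configs)} combinations per dataset")  # 4 * 4 * 5 = 80
```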