Multi-block-Single-probe Variance Reduced Estimator for Coupled Compositional Optimization
Authors: Wei Jiang, Gang Li, Yibo Wang, Lijun Zhang, Tianbao Yang
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical studies on multi-task deep AUC maximization demonstrate the better performance of the new estimator. In this section, we conduct experiments on multi-task deep AUC maximization to evaluate the proposed methods, and we will consider more applications in the long version of the paper. |
| Researcher Affiliation | Academia | (1) National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; (2) Department of Computer Science, The University of Iowa, Iowa City, USA; (3) Department of Computer Science and Engineering, Texas A&M University, College Station, USA |
| Pseudocode | Yes | Algorithm 1 MSVR-v1 and MSVR-v2 method. Algorithm 2 MSVR-v3 method. |
| Open Source Code | No | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [No] |
| Open Datasets | Yes | We use ResNet18 as the backbone network, and train on six datasets: STL10 [Coates et al., 2011], CIFAR10 [Krizhevsky, 2009], CIFAR100 [Krizhevsky, 2009], MNIST [LeCun et al., 1998], Fashion-MNIST [Xiao et al., 2017], and SVHN [Netzer et al., 2011]. (A hedged dataset-loading sketch follows the table.) |
| Dataset Splits | No | The paper mentions training on specific datasets but does not explicitly provide details about train/validation/test dataset splits, percentages, or methodology for creating them within the paper's text. |
| Hardware Specification | Yes | The experiments are conducted on a single NVIDIA Tesla M40 GPU. |
| Software Dependencies | No | The paper does not specify software dependencies with version numbers, such as "Python 3.8", "PyTorch 1.9", or "CUDA 11.1". |
| Experiment Setup | Yes | For our methods, parameters α and β are searched from {0.1, 0.5, 0.9, 1.0}. For the SOX algorithm, its parameters β and γ are searched from the same set. B1 is set to 50 for CIFAR100 and 5 for the other datasets. The inner batch size B2 is set to 128 for all methods. We tune the learning rate from the set {1e-4, 1e-3, 2e-3, 5e-3, 1e-2} and pick the best one for each method. (A hedged grid-search sketch follows the table.) |
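
All six datasets named in the Open Datasets row are distributed through `torchvision`, which makes a reproduction straightforward to bootstrap even though the authors released no code. The sketch below is therefore an assumption, not the paper's setup: the transforms, the `"data"` root, and the single-output ResNet18 head are illustrative placeholders.

```python
# Hypothetical data/backbone setup for the six benchmarks named in the paper.
# The authors released no code, so the transforms and the 1-output head are assumptions.
import torchvision
import torchvision.transforms as T

transform = T.Compose([
    T.Resize(32),   # placeholder preprocessing; the paper does not state its transforms
    T.ToTensor(),
])

datasets = {
    "STL10":         torchvision.datasets.STL10("data", split="train", download=True, transform=transform),
    "CIFAR10":       torchvision.datasets.CIFAR10("data", train=True, download=True, transform=transform),
    "CIFAR100":      torchvision.datasets.CIFAR100("data", train=True, download=True, transform=transform),
    "MNIST":         torchvision.datasets.MNIST("data", train=True, download=True, transform=transform),
    "Fashion-MNIST": torchvision.datasets.FashionMNIST("data", train=True, download=True, transform=transform),
    "SVHN":          torchvision.datasets.SVHN("data", split="train", download=True, transform=transform),
}

# MNIST and Fashion-MNIST are single-channel; replicate channels (e.g. x.repeat(3, 1, 1))
# before feeding them to an RGB backbone.
backbone = torchvision.models.resnet18(num_classes=1)  # ResNet18 backbone; scoring head is an assumption
```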
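The Experiment Setup row fully specifies the hyperparameter grid, so it can be transcribed directly. In the sketch below, only the grid values, the B1 rule, and B2 come from the paper; the `train_and_evaluate` callable is a hypothetical placeholder for the unreleased multi-task deep AUC maximization training loop.

```python
# Hyperparameter grid reported in the paper's experiment setup.
# `train_and_evaluate` is a hypothetical stand-in for the unreleased training code.
import itertools

ALPHA_BETA_GRID = [0.1, 0.5, 0.9, 1.0]            # values searched for alpha and beta
LEARNING_RATES  = [1e-4, 1e-3, 2e-3, 5e-3, 1e-2]  # learning rates tried for every method
INNER_BATCH_B2  = 128                              # inner batch size, all methods

def outer_batch_b1(dataset_name: str) -> int:
    """Outer batch size B1: 50 for CIFAR100, 5 for the other datasets."""
    return 50 if dataset_name == "CIFAR100" else 5

def grid_search(dataset_name, train_and_evaluate):
    """Return the best (alpha, beta, lr) combination by validation score (e.g. AUC)."""
    best_score, best_cfg = float("-inf"), None
    for alpha, beta, lr in itertools.product(ALPHA_BETA_GRID, ALPHA_BETA_GRID, LEARNING_RATES):
        score = train_and_evaluate(
            dataset_name,
            alpha=alpha, beta=beta, lr=lr,
            b1=outer_batch_b1(dataset_name), b2=INNER_BATCH_B2,
        )
        if score > best_score:
            best_score, best_cfg = score, (alpha, beta, lr)
    return best_cfg, best_score
```

The same grid applies to the SOX baseline, with its β and γ taking the role of α and β here.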