Training Over-parameterized Models with Non-decomposable Objectives

Authors: Harikrishna Narasimhan, Aditya K. Menon

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through experiments on benchmark image datasets, we showcase the effectiveness of our approach in training ResNet models with common robust and constrained optimization objectives. We trained ResNet-56 models on CIFAR-10 and CIFAR-100, and ResNet-18 models on Tiny ImageNet, using SGD with momentum.
Researcher Affiliation | Industry | Harikrishna Narasimhan, Google Research, Mountain View (hnarasimhan@google.com); Aditya Krishna Menon, Google Research, New York (adityakmenon@google.com)
Pseudocode | Yes | Algorithm 1: Reductions-based Algorithm for Maximizing Worst-case Recall (1). (A hedged sketch of this alternating scheme appears after the table.)
Open Source Code | No | Code will be made available at: https://github.com/google-research/google-research/tree/master/non_decomp
Open Datasets | Yes | We trained ResNet-56 models on CIFAR-10 and CIFAR-100, and ResNet-18 models on Tiny ImageNet... [47] Alex Krizhevsky. Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009. [52] Ya Le and Xuan Yang. Tiny ImageNet visual recognition challenge. CS 231N, 2015.
Dataset Splits | Yes | In each case, we use a balanced validation sample of 5000 held-out images, and a balanced test set of the same size. (A sketch of constructing such a balanced split appears after the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependency details with version numbers (e.g., library or solver names with versions).
Experiment Setup | Yes | We trained ResNet-56 models on CIFAR-10 and CIFAR-100, and ResNet-18 models on Tiny ImageNet, using SGD with momentum. We provide details about our hyper-parameter choices in Appendix E. For the CIFAR datasets, we perform 32 SGD steps on the cost-sensitive loss for every update on G, and for Tiny ImageNet, we perform 100 SGD steps for every update on G.
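
The Pseudocode and Experiment Setup rows describe an alternating scheme: a block of SGD steps on a cost-sensitive (class-weighted) loss for every update of the gain matrix G. Below is a minimal, self-contained NumPy sketch of that alternating structure on toy data; it is an illustration under stated assumptions, not the paper's implementation. The per-class weights `lam` stand in for (the diagonal of) the gain matrix G, and the exponentiated-gradient update, helper names (`weighted_sgd_step`, `per_class_recall`), learning rates, and toy data are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def weighted_sgd_step(W, X, y, lam, lr=0.1):
    # One SGD step on a cost-sensitive softmax cross-entropy:
    # examples of class c are weighted by the multiplier lam[c].
    P = softmax(X @ W)
    Y = np.eye(W.shape[1])[y]
    w = lam[y][:, None]
    return W - lr * (X.T @ (w * (P - Y))) / len(y)

def per_class_recall(W, X, y, k):
    pred = np.argmax(X @ W, axis=1)
    return np.array([np.mean(pred[y == c] == c) for c in range(k)])

# Toy stand-in for image features: k classes in d dimensions.
k, d, n = 3, 5, 600
X = rng.normal(size=(n, d))
y = rng.integers(0, k, size=n)
X += 2.0 * np.eye(k, d)[y]          # make classes roughly separable

W = np.zeros((d, k))
lam = np.full(k, 1.0 / k)           # multipliers on the simplex (diag of G)
for _ in range(50):                 # outer updates on the multipliers
    for _ in range(32):             # e.g. 32 inner SGD steps (CIFAR setting)
        idx = rng.integers(0, n, size=64)
        W = weighted_sgd_step(W, X[idx], y[idx], lam)
    r = per_class_recall(W, X, y, k)      # would use the held-out split
    lam = lam * np.exp(-1.0 * r)          # upweight low-recall classes
    lam /= lam.sum()                      # project back onto the simplex

print("per-class recalls:", np.round(per_class_recall(W, X, y, k), 3))
```

The inner-loop count is the knob quoted in the setup row: 32 SGD steps per G update on the CIFAR datasets and 100 on Tiny ImageNet.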
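
The Dataset Splits row quotes a balanced held-out validation sample of 5000 images. As a hedged sketch (not the paper's code), here is one way to carve such a class-balanced hold-out from a labelled pool, assuming labels are available as an integer array; the helper name `balanced_holdout`, the seed, and the synthetic labels are illustrative.

```python
import numpy as np

def balanced_holdout(labels, per_class, seed=0):
    # Sample `per_class` indices from every class without replacement;
    # return (holdout indices, remaining indices).
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    held = np.concatenate([
        rng.choice(np.flatnonzero(labels == c), size=per_class, replace=False)
        for c in np.unique(labels)
    ])
    rest = np.setdiff1d(np.arange(len(labels)), held)
    return held, rest

# Example: on CIFAR-10 (10 classes), 500 images per class gives the
# balanced 5000-image validation sample quoted in the table.
labels = np.random.default_rng(1).integers(0, 10, size=50000)
val_idx, train_idx = balanced_holdout(labels, per_class=500)
print(len(val_idx), len(train_idx))   # 5000 45000
```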