Relational Surrogate Loss Learning
Authors: Tao Huang, Zekang Li, Hua Lu, Yong Shan, Shusheng Yang, Yang Feng, Fei Wang, Shan You, Chang Xu
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that our method achieves improvements on various tasks including image classification and neural machine translation, and even outperforms state-of-the-art methods on human pose estimation and machine reading comprehension tasks. |
| Researcher Affiliation | Collaboration | Tao Huang^{1,2}, Zekang Li^{3}, Hua Lu^{4}, Yong Shan^{3}, Shusheng Yang^{4}, Yang Feng^{3}, Fei Wang^{5}, Shan You^{2}, Chang Xu^{1}. ^{1}School of Computer Science, Faculty of Engineering, The University of Sydney; ^{2}SenseTime Research; ^{3}Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences (ICT/CAS); ^{4}Huazhong University of Science and Technology; ^{5}University of Science and Technology of China |
| Pseudocode | Yes | Algorithm 1 Learning of surrogate losses. Input: surrogate loss L with random weights θ_l, batch size N, metric function M, data generators G_M and G_R, sample probability p. Output: learned surrogate loss with highest correlation. (See the sketch after this table.) |
| Open Source Code | Yes | Code is available at: https://github.com/hunto/ReLoss. |
| Open Datasets | Yes | We conduct experiments on three benchmark datasets: CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009), and ImageNet (Deng et al., 2009). |
| Dataset Splits | Yes | "On the CIFAR-10 and CIFAR-100 datasets, we train ResNet-20 for 200 epochs with an initial learning rate of 0.1, which decays by 0.1 at the 100th and 150th epochs; the batch size is set to 128 with cutout (DeVries & Taylor, 2017) data augmentation; we run each experiment 5 times with different random seeds and report the mean accuracy with standard deviation." and "We take news-dev-2016 and news-test-2016 as development and test sets." |
| Hardware Specification | Yes | Note that our additional cost O(T_l) of learning ReLoss is only 0.5 GPU hours on image classification with a single NVIDIA TITAN Xp GPU, and we only need to train ReLoss once for each task, greatly reducing computational cost compared to previous works. |
| Software Dependencies | No | The paper mentions software components like 'MMPose' and 'torchvision', and optimizers like 'Adam' and 'SGD', but does not provide specific version numbers for these software dependencies, nor does it list programming language versions. |
| Experiment Setup | Yes | On the CIFAR-10 and CIFAR-100 datasets, we train ResNet-20 for 200 epochs with an initial learning rate of 0.1, which decays by 0.1 at the 100th and 150th epochs; the batch size is set to 128 with cutout (DeVries & Taylor, 2017) data augmentation; we run each experiment 5 times with different random seeds and report the mean accuracy with standard deviation. (See the schedule sketch after this table.) |
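
The quoted Algorithm 1 can be read as a correlation-maximization loop: optimize the surrogate loss network's weights θ_l so that its output correlates with the evaluation metric across sampled batches. Below is a minimal PyTorch sketch under stated assumptions: the `SurrogateLoss` architecture, the `gen_batches` generator, the batch-level features, and the use of Pearson correlation as the objective are all illustrative placeholders, not the authors' implementation (see https://github.com/hunto/ReLoss for the real one).

```python
# Hypothetical minimal sketch of Algorithm 1 (surrogate-loss learning).
# `SurrogateLoss`, `gen_batches`, and the Pearson objective are assumptions
# standing in for the paper's exact relational objective and generators.
import torch
import torch.nn as nn

class SurrogateLoss(nn.Module):
    """Tiny MLP mapping batch-level prediction features to a scalar loss."""
    def __init__(self, in_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (N, in_dim) features of a model's outputs on one batch.
        return self.net(feats).mean()

def pearson(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Differentiable Pearson correlation between two 1-D tensors."""
    x = x - x.mean()
    y = y - y.mean()
    return (x * y).sum() / (x.norm() * y.norm() + 1e-8)

def learn_surrogate(loss_fn, metric_fn, gen_batches, steps=1000, lr=1e-3, n=32):
    """Optimize loss_fn so that its negated output correlates with the
    (non-differentiable) metric across n sampled batches per step."""
    opt = torch.optim.Adam(loss_fn.parameters(), lr=lr)
    for _ in range(steps):
        losses, metrics = [], []
        for feats, preds, labels in gen_batches(n):
            losses.append(loss_fn(feats))
            metrics.append(metric_fn(preds, labels))
        losses = torch.stack(losses)
        metrics = torch.as_tensor(metrics, dtype=losses.dtype)
        # Higher metric should mean lower loss, so correlate -loss with metric.
        corr = pearson(-losses, metrics)
        opt.zero_grad()
        (-corr).backward()  # gradient ascent on the correlation
        opt.step()
    return loss_fn
```

Once trained, the surrogate is frozen and used as an ordinary loss when training task models, which is why the paper notes the 0.5 GPU-hour learning cost is paid only once per task.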
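
The quoted CIFAR schedule maps directly onto a standard PyTorch `MultiStepLR`. In the sketch below, the momentum and weight-decay values are assumed common CIFAR defaults (the quoted text gives neither), and a dummy module stands in for ResNet-20, which is not in torchvision.

```python
# Minimal sketch of the quoted schedule: 200 epochs, initial LR 0.1,
# decayed by 0.1 at epochs 100 and 150, batch size 128 with Cutout.
import torch
import torch.nn as nn

# Dummy stand-in for ResNet-20; keeps the sketch self-contained.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

# ASSUMPTION: momentum 0.9 / weight decay 5e-4 are common CIFAR defaults,
# not values stated in the quoted text.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[100, 150], gamma=0.1)

for epoch in range(200):
    # ... train one epoch at batch size 128 with Cutout augmentation ...
    scheduler.step()
```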