Module-Aware Optimization for Auxiliary Learning

Authors: Hong Chen, Xin Wang, Yue Liu, Yuwei Zhou, Chaoyu Guan, Wenwu Zhu

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that our proposed MAOAL method consistently outperforms state-of-the-art baselines for different auxiliary losses on various datasets, demonstrating that our method can serve as a powerful generic tool for auxiliary learning.
Researcher Affiliation | Academia | Hong Chen, Xin Wang, Yue Liu, Yuwei Zhou, Chaoyu Guan, Wenwu Zhu (Tsinghua University); {h-chen20,liuyue17,zhou-yw21,guancy19}@mails.tsinghua.edu.cn, {xin_wang,wwzhu}@tsinghua.edu.cn
Pseudocode | Yes | Algorithm 1: Module-Aware Optimization for Auxiliary Learning (MAOAL)
Open Source Code | Yes | Our code will be released at https://github.com/forchchch/MAOAL
Open Datasets | Yes | We conduct experiments on two fine-grained image classification datasets, CUB [44] and Oxford-IIIT Pet [45], and two widely adopted general image classification datasets, CIFAR10 and CIFAR100 [46]. ... We evaluate our methods on two datasets with different sparsity, Amazon Beauty [47] and MovieLens-1M [48]. ... Additionally, based on the reviews, we add additional experiments on the NYUv2 [50] and CIFAR100/20 datasets...
Dataset Splits | Yes | The input for the algorithm contains three datasets: the training dataset Dtrain, the developing dataset Ddev and the validation dataset Dv
Hardware Specification | No | The paper states 'Refer to the supplementary file' for compute resources and hardware details, meaning this information is not provided in the main text.
Software Dependencies | No | The paper mentions general models and frameworks such as 'ResNet18', a '4-layer convolutional network (ConvNet)', and 'AutoInt', but does not provide specific software versions for libraries, frameworks, or languages (e.g., Python or PyTorch versions).
Experiment Setup | Yes | The input for the algorithm contains three datasets: the training dataset Dtrain, the developing dataset Ddev and the validation dataset Dv, and the hyperparameters. T and η1 are used for the lower optimization, where T is the number of lower-optimization steps conducted in one loop and η1 is the learning rate for the lower optimization. M and η2 are used for the upper optimization, where M is the total number of looking-back steps in Eq. (9) and η2 is the learning rate for the upper optimization. ... We implement the task-specific heads with multi-layer perceptrons (MLPs) whose layer number is searched in {1, 2}.
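
To make the quoted setup concrete, below is a minimal, self-contained sketch of a bi-level auxiliary-learning loop that mirrors the hyperparameters named above: T lower-level steps with learning rate η1 on the training split, followed by one upper-level update with learning rate η2 on the developing split. It is a sketch under assumptions, not the paper's implementation: ToyModel, aux_weight, and fake_batch are hypothetical placeholders, the module-aware parameterization is reduced to a single scalar auxiliary-loss weight, and the upper update is a generic one-step first-order approximation rather than the M-step looking-back update of Eq. (9). It assumes PyTorch 2.x for torch.func.functional_call.

# Hedged sketch of a bi-level auxiliary-learning loop (NOT the authors' code).
# Requires PyTorch 2.x; all model/data names below are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call

torch.manual_seed(0)

class ToyModel(nn.Module):
    """Toy backbone with a primary head and one auxiliary head (placeholder)."""
    def __init__(self, d_in=16, d_hidden=32, n_classes=4):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.primary_head = nn.Linear(d_hidden, n_classes)
        self.aux_head = nn.Linear(d_hidden, n_classes)

    def forward(self, x):
        h = self.backbone(x)
        return self.primary_head(h), self.aux_head(h)

def combined_loss(model, batch, aux_weight):
    # Training loss: primary loss plus the auxiliary loss scaled by an
    # upper-level weight (a single scalar here, module-wise in the paper).
    x, y = batch
    main_logits, aux_logits = model(x)
    return F.cross_entropy(main_logits, y) + aux_weight * F.cross_entropy(aux_logits, y)

def fake_batch(n=32, d=16, c=4):
    # Stand-in for batches drawn from Dtrain / Ddev.
    return torch.randn(n, d), torch.randint(0, c, (n,))

model = ToyModel()
aux_weight = nn.Parameter(torch.tensor(0.5))   # upper-level variable (placeholder)
T, eta1, eta2 = 5, 1e-2, 1e-3                  # hyperparameters named in the quote
lower_opt = torch.optim.SGD(model.parameters(), lr=eta1)
upper_opt = torch.optim.SGD([aux_weight], lr=eta2)

for loop in range(3):
    # Lower optimization: T steps on the training loss with learning rate eta1.
    for _ in range(T):
        lower_opt.zero_grad()
        loss = combined_loss(model, fake_batch(), aux_weight.detach())
        loss.backward()
        lower_opt.step()

    # Upper optimization (first-order approximation): take one differentiable
    # virtual lower step, then backpropagate the dev loss into aux_weight.
    params = {k: v.detach().requires_grad_(True) for k, v in model.named_parameters()}
    x_tr, y_tr = fake_batch()
    main_tr, aux_tr = functional_call(model, params, (x_tr,))
    train_loss = F.cross_entropy(main_tr, y_tr) + aux_weight * F.cross_entropy(aux_tr, y_tr)
    grads = torch.autograd.grad(train_loss, list(params.values()), create_graph=True)
    virtual = {k: v - eta1 * g for (k, v), g in zip(params.items(), grads)}

    x_dev, y_dev = fake_batch()
    main_dev, _ = functional_call(model, virtual, (x_dev,))
    dev_loss = F.cross_entropy(main_dev, y_dev)

    upper_opt.zero_grad()
    dev_loss.backward()
    upper_opt.step()
    print(f"loop {loop}: dev_loss={dev_loss.item():.3f}, aux_weight={aux_weight.item():.3f}")

In MAOAL proper, the upper-level variables weight the auxiliary losses at module granularity and the upper gradient looks back over the last M lower-optimization steps (Eq. (9)); the repository linked above should be treated as the reference implementation.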