AutoMTL: A Programming Framework for Automating Efficient Multi-Task Learning

Authors: Lijun Zhang, Xiao Liu, Hui Guan

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on three popular MTL benchmarks (Cityscapes, NYUv2, Tiny-Taskonomy) demonstrate the effectiveness of AutoMTL over state-of-the-art approaches, as well as the generalizability of AutoMTL across CNNs.
Researcher Affiliation | Academia | Lijun Zhang, Xiao Liu, and Hui Guan, College of Information & Computer Sciences, University of Massachusetts Amherst, Amherst, MA 01003 (lijunzhang@cs.umass.edu, xiaoliu1990@cs.umass.edu, huiguan@cs.umass.edu).
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | AutoMTL is open-sourced and available at https://github.com/zhanglijun95/AutoMTL.
Open Datasets | Yes | The experiments use three popular datasets in multi-task learning (MTL): Cityscapes [10], NYUv2 [41], and Tiny-Taskonomy [50].
Dataset Splits | Yes | All data splits follow the experimental settings in AdaShare [43].
Hardware Specification | Yes | The experiments were conducted on an NVIDIA RTX 8000.
Software Dependencies | No | The paper mentions PyTorch but does not provide specific version numbers for it or any other software dependencies.
Experiment Setup | Yes | AutoMTL implements a three-stage training pipeline to generate a well-trained multi-task model. The first stage, pre-train, aims at obtaining a good initialization for the multi-task supermodel by pre-training on tasks jointly [51]. The second stage, policy-train, jointly optimizes the sharing policy and the model parameters. The last stage, post-train, trains the identified multi-task model until it converges. The overall loss is defined as L = Σ_i λ_i L_i + λ_reg L_reg, where λ_i is a hyperparameter controlling how much each task contributes to the overall loss, and λ_reg is a hyperparameter balancing the task-specific losses against L_reg. Table 6 shows quantitative results on Cityscapes with different values of λ_reg.