AutoMTL: A Programming Framework for Automating Efficient Multi-Task Learning
Authors: Lijun Zhang, Xiao Liu, Hui Guan
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on three popular MTL benchmarks (CityScapes, NYUv2, Tiny-Taskonomy) demonstrate the effectiveness of AutoMTL over state-of-the-art approaches as well as the generalizability of AutoMTL across CNNs. |
| Researcher Affiliation | Academia | Lijun Zhang, College of Information & Computer Sciences, University of Massachusetts Amherst, Amherst, MA 01003, lijunzhang@cs.umass.edu; Xiao Liu, College of Information & Computer Sciences, University of Massachusetts Amherst, Amherst, MA 01003, xiaoliu1990@cs.umass.edu; Hui Guan, College of Information & Computer Sciences, University of Massachusetts Amherst, Amherst, MA 01003, huiguan@cs.umass.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | AutoMTL is open-sourced and available at https://github.com/zhanglijun95/AutoMTL. |
| Open Datasets | Yes | Our experiments use three popular datasets in multi-task learning (MTL): CityScapes [10], NYUv2 [41], and Tiny-Taskonomy [50]. |
| Dataset Splits | Yes | All the data splits follow the experimental settings in AdaShare [43]. |
| Hardware Specification | Yes | The experiments were conducted on an Nvidia RTX8000. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not provide specific version numbers for it or any other software dependencies. |
| Experiment Setup | Yes | AutoMTL implements a three-stage training pipeline to generate a well-trained multi-task model. The first stage, pre-train, aims at obtaining a good initialization for the multi-task supermodel by pre-training on tasks jointly [51]. The second stage, policy-train, jointly optimizes the sharing policy and the model parameters. The last stage, post-train, trains the identified multi-task model until it converges. The overall loss is defined as $L = \sum_i \lambda_i L_i + \lambda_{reg} L_{reg}$, where $\lambda_i$ is a hyperparameter controlling how much each task contributes to the overall loss, and $\lambda_{reg}$ is a hyperparameter balancing the task-specific losses and $L_{reg}$. Table 6 shows quantitative results on CityScapes with different $\lambda_{reg}$. |
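
As a concrete illustration of the loss described in the Experiment Setup row, the sketch below shows how the weighted combination $L = \sum_i \lambda_i L_i + \lambda_{reg} L_{reg}$ could be computed in PyTorch. This is a minimal sketch, not the authors' implementation; the task names, weights, and regularization value are illustrative assumptions.

```python
import torch

def overall_loss(task_losses, task_weights, reg_loss, lambda_reg):
    """Combine per-task losses with a sharing-policy regularization term.

    task_losses : dict mapping task name -> scalar loss tensor (L_i)
    task_weights: dict mapping task name -> float weight (lambda_i)
    reg_loss    : scalar tensor, regularization term (L_reg)
    lambda_reg  : float, balances task-specific losses against L_reg
    """
    weighted_sum = sum(task_weights[t] * task_losses[t] for t in task_losses)
    return weighted_sum + lambda_reg * reg_loss

# Hypothetical usage with two CityScapes tasks (values are placeholders):
losses = {"seg": torch.tensor(1.2), "depth": torch.tensor(0.8)}
weights = {"seg": 1.0, "depth": 1.0}
total = overall_loss(losses, weights, reg_loss=torch.tensor(0.05), lambda_reg=0.1)
```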