Efficient and Effective Multi-task Grouping via Meta Learning on Task Combinations
Authors: Xiaozhuang Song, Shun Zheng, Wei Cao, James J.Q. Yu, Jiang Bian
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on diversified multi-task scenarios demonstrate the efficiency and effectiveness of our method. |
| Researcher Affiliation | Collaboration | Xiaozhuang Song (Southern University of Science and Technology, shawnsxz97@gmail.com); Shun Zheng (Microsoft Research, shun.zheng@microsoft.com); Wei Cao (Microsoft Research, wei.cao@microsoft.com); James J.Q. Yu (Southern University of Science and Technology, yujq3@sustech.edu.cn); Jiang Bian (Microsoft Research, jiang.bian@microsoft.com) |
| Pseudocode | Yes | Algorithm 1: Active Learning for MTG-Net |
| Open Source Code | Yes | Data and code are available at https://github.com/ShawnKS/MTG-Net. |
| Open Datasets | Yes | Taskonomy [49] is a computer vision dataset... ETTm1 [46] is an electric load dataset... MIMIC-III [17] is a healthcare database... |
| Dataset Splits | Yes | we retain the same split of train, validation, and test sets for each MTL procedure and fix the optimization algorithm as well as other hyperparameters. |
| Hardware Specification | No | All these MTL procedures cost thousands of GPU hours in total, and we will release the collected meta datasets for future research. No specific GPU model, CPU, or cloud provider details are provided. |
| Software Dependencies | No | The paper mentions using neural networks and various learning paradigms but does not specify any software libraries or their version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We use the same hyper-parameters for all MTL datasets. Specifically, we set the dimension of task embeddings to D = 64 and stack 2 self-attention encoding layers [45]. For Algorithm 1, we set α to 25 to prioritize the selection of task combinations with large gains. Besides, we leverage a dynamic strategy to schedule η: at an early stage, when |C_train| ≤ N + 1, we set η to 1 to frequently update MTG-Net and pursue more effective selections; when |C_train| > N + 1, we set η to N to reduce the number of MTG-Net updates and further improve efficiency. K is the hyper-parameter that decides the total number of meta-training samples, and it is specified along with each figure. |
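
The dynamic η schedule quoted in the Experiment Setup row is simple enough to capture in a few lines. Below is a minimal sketch of that schedule and the surrounding active-learning loop of Algorithm 1, assuming hypothetical names (`select_combinations`, `run_mtl`, `train_mtgnet`) that stand in for the released MTG-Net code rather than reproducing its actual API; the constants mirror the values reported in the paper, except N and K, which vary per experiment.

```python
# Minimal sketch of the reported hyper-parameters and the dynamic eta
# schedule for Algorithm 1. All function names below are hypothetical
# placeholders, not the authors' released API.

N = 5        # number of tasks in the MTL scenario (assumed for illustration)
D = 64       # task-embedding dimension (2 self-attention layers are stacked)
ALPHA = 25   # prioritizes selecting task combinations with large gains
K = 100      # total meta-training budget; the paper sets this per figure


def eta(num_collected: int, n_tasks: int = N) -> int:
    """Dynamic schedule from the paper: update MTG-Net after every new
    sample while |C_train| <= N + 1, then only every N samples."""
    return 1 if num_collected <= n_tasks + 1 else n_tasks


# Hypothetical shape of the active-learning loop (Algorithm 1):
#
#   c_train = seed_combinations()
#   while len(c_train) < K:
#       batch = select_combinations(model, size=eta(len(c_train)),
#                                   alpha=ALPHA)        # gain-weighted picks
#       c_train += [(c, run_mtl(c)) for c in batch]     # full MTL evaluations
#       model = train_mtgnet(c_train, embed_dim=D, n_layers=2)
```

The design intent, per the quoted setup: the η = 1 early phase trades compute for selection quality while the meta-training set is small, and switching to η = N afterwards amortizes MTG-Net retraining across N expensive MTL evaluations.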