AdaTask: A Task-Aware Adaptive Learning Rate Approach to Multi-Task Learning
Authors: Enneng Yang, Junwei Pan, Ximei Wang, Haibin Yu, Li Shen, Xihua Chen, Lei Xiao, Jie Jiang, Guibing Guo
AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments on computer vision and recommender system MTL datasets demonstrate that AdaTask significantly improves the performance of dominated tasks, resulting in SOTA average task-wise performance. In this section, we experimentally verify the effectiveness of the proposed AdaTask. We conduct experiments on a synthetic dataset and several real-world datasets: the CityScapes dataset in computer vision, and the TikTok and WeChat datasets in recommender systems. |
| Researcher Affiliation | Collaboration | Enneng Yang¹*, Junwei Pan²*, Ximei Wang², Haibin Yu², Li Shen³, Xihua Chen², Lei Xiao², Jie Jiang², Guibing Guo¹ (¹Northeastern University, China; ²Tencent Inc., China; ³JD Explore Academy, China) |
| Pseudocode | Yes | Algorithm 1: AdaGrad, RMSProp and Adam in MTL. Algorithm 2: AdaGrad, RMSProp and Adam with AdaTask in MTL. (An illustrative sketch of the task-wise update appears after this table.) |
| Open Source Code | No | No explicit statement or link confirming the availability of open-source code for the described methodology was found. |
| Open Datasets | Yes | We conduct experiments on a synthetic dataset and several real-world datasets: the CityScapes dataset in computer vision, and the TikTok and WeChat datasets in recommender systems. Extensive experiments on the synthetic and three public datasets from CV and recommendation demonstrate that AdaTask significantly improves the performance of the dominated task(s), while achieving SOTA average task-wise performance. |
| Dataset Splits | No | More details of our experiment, including baselines, datasets, and implementation details, can be found in our full version (Yang et al. 2022b). The main paper does not provide specific train/validation/test split percentages or sample counts for any of the datasets. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, memory, or cloud instance types) used for running experiments were provided in the text. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., library names with versions) were provided in the text. |
| Experiment Setup | No | The paper mentions general optimizer types (RMSProp, Adam) and network architecture (a 4-layer fully connected neural network), along with general hyperparameters (η, γ, ϵ, γ₁, γ₂), but does not provide specific numerical values for these or for other common experimental setup details such as batch size, number of epochs, or learning rates. It defers to a 'full version' for implementation details. |
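
The Pseudocode row names Algorithm 2 (AdaGrad, RMSProp and Adam with AdaTask) but does not reproduce it. As a rough illustration of the core idea, the sketch below applies task-wise accumulative gradients to an Adam-style update: each task keeps its own first/second moment estimates for the shared parameters, so a dominating task's large gradients cannot inflate the adaptive denominator applied to the other tasks. This is a minimal sketch under those assumptions, not the paper's exact algorithm; the function name `adatask_adam_step`, the moment lists `m`/`v`, and the hyperparameter defaults are all illustrative.

```python
import numpy as np

def adatask_adam_step(param, task_grads, m, v, t,
                      lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One illustrative AdaTask-style Adam update of a shared parameter.

    param      : np.ndarray, shared parameter tensor
    task_grads : list of np.ndarray, each task's gradient w.r.t. param
    m, v       : lists of per-task first/second moment arrays (same shapes)
    t          : int, 1-based step count used for bias correction
    """
    update = np.zeros_like(param)
    for k, g in enumerate(task_grads):
        # Core AdaTask idea: moments are accumulated *per task*, instead of
        # over the summed multi-task gradient as in vanilla Adam.
        m[k] = beta1 * m[k] + (1 - beta1) * g
        v[k] = beta2 * v[k] + (1 - beta2) * g * g
        m_hat = m[k] / (1 - beta1 ** t)   # bias-corrected first moment
        v_hat = v[k] / (1 - beta2 ** t)   # bias-corrected second moment
        # Each task's gradient is normalized by its *own* statistics
        # before the per-task updates are combined.
        update += m_hat / (np.sqrt(v_hat) + eps)
    return param - lr * update

# Toy usage: two tasks sharing one parameter vector.
param = np.zeros(3)
m = [np.zeros(3), np.zeros(3)]
v = [np.zeros(3), np.zeros(3)]
grads = [np.array([10.0, 10.0, 10.0]),   # "dominating" task
         np.array([0.1, 0.1, 0.1])]      # weaker task
param = adatask_adam_step(param, grads, m, v, t=1)
```

Because each task is normalized by its own second moment, the weak task's contribution to the first step has the same magnitude as the dominating task's, which is the mechanism the paper credits for improving the dominated task(s).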