AdaTask: A Task-Aware Adaptive Learning Rate Approach to Multi-Task Learning

Authors: Enneng Yang, Junwei Pan, Ximei Wang, Haibin Yu, Li Shen, Xihua Chen, Lei Xiao, Jie Jiang, Guibing Guo

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Comprehensive experiments on computer vision and recommender system MTL datasets demonstrate that AdaTask significantly improves the performance of dominated tasks, resulting in SOTA average task-wise performance." "In this section, we experimentally verify the effectiveness of the proposed AdaTask. We conduct experiments on a synthetic dataset and several real-world datasets: the CityScapes dataset in computer vision, and the TikTok and WeChat datasets in recommender systems."
Researcher Affiliation | Collaboration | Enneng Yang¹*, Junwei Pan²*, Ximei Wang², Haibin Yu², Li Shen³, Xihua Chen², Lei Xiao², Jie Jiang², Guibing Guo¹ (¹Northeastern University, China; ²Tencent Inc., China; ³JD Explore Academy, China)
Pseudocode | Yes | Algorithm 1: AdaGrad, RMSProp, and Adam in MTL. Algorithm 2: AdaGrad, RMSProp, and Adam with AdaTask in MTL. (A hedged sketch of the task-wise update appears after this table.)
Open Source Code | No | No explicit statement or link confirming the availability of open-source code for the described methodology was found.
Open Datasets | Yes | "We conduct experiments on a synthetic dataset and several real-world datasets: the CityScapes dataset in computer vision, and the TikTok and WeChat datasets in recommender systems." "Extensive experiments on the synthetic dataset and three public datasets from CV and recommendation demonstrate that AdaTask significantly improves the performance of the dominated task(s) while achieving SOTA average task-wise performance."
Dataset Splits | No | "More details of our experiment, including baselines, datasets, and implementation details, can be found in our full version (Yang et al. 2022b)." The main paper does not provide specific train/validation/test split percentages or sample counts for any of the datasets.
Hardware Specification | No | No hardware details (GPU/CPU models, memory, or cloud instance types) used to run the experiments are provided in the text.
Software Dependencies | No | No software dependencies with version numbers (e.g., library names with versions) are provided in the text.
Experiment Setup | No | The paper mentions optimizer types (RMSProp, Adam), a network architecture (a 4-layer fully connected neural network), and general hyperparameter symbols (η, γ, ε, γ₁, γ₂), but gives no specific numerical values for them, nor common setup details such as batch size, number of epochs, or learning rates; it defers implementation details to the full version.
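
The Pseudocode row above refers to the paper's Algorithm 2 (AdaGrad, RMSProp, and Adam with AdaTask). The core idea is to keep separate accumulated gradient statistics for each task on the shared parameters, rather than one shared accumulator, so that a task with dominant gradients cannot shrink the adaptive learning rate seen by the other tasks. Below is a minimal sketch of the Adam variant of that idea, not the authors' code (none was found, per the table): the function name, the NumPy formulation, and the mapping of the paper's η, γ₁, γ₂, ε to lr, gamma1, gamma2, eps are our assumptions; Algorithm 2 in the paper is the authoritative reference.

```python
import numpy as np

def adam_with_adatask(params, task_grads, state, lr=1e-3,
                      gamma1=0.9, gamma2=0.999, eps=1e-8):
    """One Adam step with AdaTask-style task-wise accumulators (sketch).

    Standard Adam keeps one (m, v) pair for the shared parameters, so a
    task with large gradients inflates v and shrinks every task's
    effective learning rate. Here each task k keeps its own (m_k, v_k),
    and the update sums the per-task adaptive steps.
    """
    state["t"] += 1
    t = state["t"]
    update = np.zeros_like(params)
    for k, g in enumerate(task_grads):  # g: grad of task k's loss w.r.t. shared params
        m = gamma1 * state["m"][k] + (1 - gamma1) * g      # task-wise 1st moment
        v = gamma2 * state["v"][k] + (1 - gamma2) * g * g  # task-wise 2nd moment
        state["m"][k], state["v"][k] = m, v
        m_hat = m / (1 - gamma1 ** t)  # bias correction, as in Adam
        v_hat = v / (1 - gamma2 ** t)
        update += m_hat / (np.sqrt(v_hat) + eps)
    return params - lr * update

# Usage: two tasks sharing a 4-dimensional parameter vector.
params = np.zeros(4)
state = {"t": 0,
         "m": [np.zeros(4) for _ in range(2)],
         "v": [np.zeros(4) for _ in range(2)]}
task_grads = [np.array([1.0, 0.0, 0.5, 0.0]),  # hypothetical task-1 gradient
              np.array([0.0, 2.0, 0.0, 0.1])]  # hypothetical task-2 gradient
params = adam_with_adatask(params, task_grads, state)
```

With a single task this reduces to ordinary Adam; the only structural change is that the moment estimates are indexed by task, which is the separation Algorithm 2 formalizes for AdaGrad, RMSProp, and Adam alike.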