Multi-Task Learning as Multi-Objective Optimization

Authors: Ozan Sener, Vladlen Koltun

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply our method to a variety of multi-task deep learning problems, including digit classification, scene understanding (joint semantic segmentation, instance segmentation, and depth estimation), and multi-label classification. Our method produces higher-performing models than recent multi-task learning formulations or per-task training.
Researcher Affiliation | -1 | The paper only lists "Vladlen Koltun" as the author and the conference details (NeurIPS 2018). No institutional affiliations, company names, or email domains are provided in the paper text.
Pseudocode | Yes | Algorithm 1: min_{γ∈[0,1]} ‖γθ + (1−γ)θ̄‖₂²; Algorithm 2: Update Equations for MTL
Open Source Code | No | The paper does not contain any explicit statement about making its source code publicly available, nor does it provide a link to a code repository.
Open Datasets | Yes | First, we use MultiMNIST (Sabour et al., 2017), an MTL adaptation of MNIST (LeCun et al., 1998). Next, we tackle multi-label classification on the CelebA dataset (Liu et al., 2015b)... Finally, we experiment with scene understanding... Cityscapes dataset (Cordts et al., 2016).
Dataset Splits | Yes | We use MultiMNIST (Sabour et al., 2017), an MTL adaptation of MNIST (LeCun et al., 1998)... We use 60K examples and directly apply existing single-task MNIST models. (This implies use of the standard MNIST splits; similarly for CelebA and Cityscapes by citation.)
Hardware Specification | Yes | Runtime was measured on a single Titan Xp GPU.
Software Dependencies | No | The paper cites "Automatic differentiation in PyTorch" in its references, but it does not specify the PyTorch version used or any other software dependencies with specific version numbers.
Experiment Setup | No | The paper describes the models used (e.g., LeNet, ResNet-18, ResNet-50) and refers to the supplement for some architectural details, but it does not provide specific hyperparameters (e.g., learning rate, batch size, number of epochs) or detailed training configurations in the main text.
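The subproblem named in the Pseudocode row, min over γ∈[0,1] of ‖γθ + (1−γ)θ̄‖₂², admits a closed-form solution: set γ to ((θ̄−θ)·θ̄)/‖θ−θ̄‖₂² and clip it to [0,1]. A minimal NumPy sketch of that two-vector solver follows; the function name `min_norm_gamma` and the example gradients are illustrative, not from the paper.

```python
import numpy as np

def min_norm_gamma(theta, theta_bar):
    """Closed-form minimizer of ||gamma*theta + (1-gamma)*theta_bar||_2^2
    over gamma in [0, 1] (the two-vector subproblem of Algorithm 1)."""
    diff = theta - theta_bar
    denom = diff @ diff
    if denom == 0.0:
        # theta == theta_bar: the objective is constant in gamma.
        return 0.5
    gamma = (theta_bar - theta) @ theta_bar / denom
    # The unconstrained minimizer is projected onto [0, 1].
    return float(np.clip(gamma, 0.0, 1.0))

# Two orthogonal (conflicting) task gradients: the optimum weights them equally.
theta = np.array([1.0, 0.0])
theta_bar = np.array([0.0, 1.0])
gamma = min_norm_gamma(theta, theta_bar)  # -> 0.5
combined = gamma * theta + (1 - gamma) * theta_bar
```

When one gradient dominates, the clip activates and the solver returns an endpoint (γ = 0 or 1), i.e., the shorter vector alone is the min-norm point.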