AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning

Authors: Ximeng Sun, Rameswar Panda, Rogerio Feris, Kate Saenko

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on several challenging and diverse benchmark datasets with a variable number of tasks well demonstrate the efficacy of our approach over state-of-the-art methods." and "Quantitative Results. Tables 1-4 show the task performance in four different learning scenarios, namely NYU-v2 2-Task Learning, CityScapes 2-Task Learning, NYU-v2 3-Task Learning and Tiny Taskonomy 5-Task Learning."
Researcher Affiliation | Collaboration | Boston University; MIT-IBM Watson AI Lab, IBM Research
Pseudocode | No | The paper describes its methods using text and mathematical equations but does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks or figures.
Open Source Code | No | "Project page: https://cs-people.bu.edu/sunxm/AdaShare/project.html" (Abstract). This is a project overview page and does not provide direct access to the source code for download.
Open Datasets | Yes | "We evaluate the performance of our approach using several standard datasets, namely NYU v2 [40]... CityScapes [11]... and Tiny Taskonomy [68]..."
Dataset Splits | No | The paper refers to 'training splits' and 'validation' in its experiments and metrics (e.g., Table 5) and uses standard datasets that typically come with predefined splits. However, it does not state the specific percentages or sample counts for the training, validation, and test splits used in the experiments, nor whether its split methodology deviates from standard practice for these datasets.
Hardware Specification | No | The paper discusses the deep learning architectures used (e.g., ResNet-34, ResNet-18) but does not provide any specific details about the hardware (e.g., GPU models, CPU types, or memory) used to run the experiments.
Software Dependencies | No | The paper mentions specific optimizers such as Adam [28] and SGD, and references architectures such as Deeplab-ResNet [9] and VD-CNN [10]. However, it does not specify software dependencies such as programming-language versions (e.g., Python 3.x) or deep learning framework versions (e.g., PyTorch 1.x, TensorFlow 2.x) needed for reproducibility.
Experiment Setup | Yes | "Experimental Settings. We use Deeplab-ResNet [9]... as our backbone and the ASPP [9] architecture as task-specific heads... Following [61], we use Adam [28] to update the policy distribution parameters and SGD to update the network parameters... We use cross-entropy loss for Semantic Segmentation as well as classification tasks, and the inverse of cosine similarity between the normalized prediction and ground truth for Surface Normal Prediction. L1 loss is used for all other tasks."
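For readers reconstructing the setup from the quoted passage alone, below is a minimal PyTorch sketch of the optimizer split (Adam for the policy distribution parameters, SGD for the network weights) and the per-task losses described above. All module shapes and variable names are hypothetical stand-ins, and "inverse of cosine similarity" is interpreted here as 1 minus cosine similarity; the paper does not publish reference code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical stand-ins for the Deeplab-ResNet backbone and ASPP
# task-specific heads; shapes are illustrative only.
backbone = nn.Conv2d(3, 8, 3, padding=1)
heads = nn.ModuleDict({
    "segmentation": nn.Conv2d(8, 13, 1),   # per-pixel class logits
    "surface_normal": nn.Conv2d(8, 3, 1),  # per-pixel normal vectors
    "depth": nn.Conv2d(8, 1, 1),           # generic regression task
})

# Policy distribution parameters (select-or-skip logits per block/task).
# In AdaShare these gate residual blocks via Gumbel-Softmax sampling;
# that forward-pass machinery is omitted here for brevity.
policy_logits = nn.Parameter(torch.zeros(16, 2))

# The optimizer split quoted above: Adam for policy parameters,
# SGD for network parameters.
policy_opt = torch.optim.Adam([policy_logits], lr=1e-2)
net_params = list(backbone.parameters()) + list(heads.parameters())
net_opt = torch.optim.SGD(net_params, lr=1e-3, momentum=0.9)

def surface_normal_loss(pred, target):
    """Inverse of cosine similarity between normalized prediction and
    ground truth, taken here as 1 - cos(pred, target)."""
    pred = F.normalize(pred, dim=1)
    target = F.normalize(target, dim=1)
    return 1.0 - (pred * target).sum(dim=1).mean()

# One illustrative training step on random data.
x = torch.randn(2, 3, 32, 32)
feat = backbone(x)
losses = {
    # Cross-entropy for semantic segmentation (and classification tasks).
    "segmentation": F.cross_entropy(
        heads["segmentation"](feat), torch.randint(0, 13, (2, 32, 32))),
    "surface_normal": surface_normal_loss(
        heads["surface_normal"](feat), torch.randn(2, 3, 32, 32)),
    # L1 loss for all other tasks.
    "depth": F.l1_loss(heads["depth"](feat), torch.randn(2, 1, 32, 32)),
}
total = sum(losses.values())
policy_opt.zero_grad()
net_opt.zero_grad()
total.backward()
policy_opt.step()
net_opt.step()
```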