Multi-Task Structural Learning using Local Task Similarity induced Neuron Creation and Removal
Authors: Naresh Kumar Gurulingan, Bahram Zonooz, Elahe Arani
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical results show that MTSL achieves competitive generalization with various baselines and improves robustness to out-of-distribution data. Code available at https://github.com/NeurAI-Lab/MTSL. We evaluate the strengths of our approach using two datasets, namely Cityscapes (Cordts et al., 2016) and NYUv2 (Silberman et al., 2012). |
| Researcher Affiliation | Collaboration | (1) Advanced Research Lab, NavInfo Europe, Netherlands; (2) Dept. of Mathematics and Computer Science, Eindhoven University of Technology, Netherlands. |
| Pseudocode | Yes | Algorithm 1 MTSL algorithm |
| Open Source Code | Yes | Code available at https://github.com/NeurAI-Lab/MTSL. |
| Open Datasets | Yes | We evaluate the strengths of our approach using two datasets, namely Cityscapes (Cordts et al., 2016) and NYUv2 (Silberman et al., 2012). |
| Dataset Splits | Yes | The Cityscapes dataset is an outdoor driving-scenes dataset consisting of 2975 training images and 500 validation images. The NYUv2 dataset consists of indoor scenes, with 795 training images and 654 validation images. |
| Hardware Specification | Yes | Each experiment was run on an NVIDIA Tesla V100 GPU in a DGX cluster. |
| Software Dependencies | No | Insufficient information. The paper names software components such as the Adam optimizer and a ResNet18 backbone with a DeepLab head, but does not specify version numbers or any other software dependencies. |
| Experiment Setup | Yes | For training, we use the Adam optimizer with a learning rate of 1e-4 for 80 epochs. We use a step-wise learning-rate schedule with steps at epochs 60 and 70. The batch size is 16, the weight decay is 5e-5, and all task losses are weighted equally (each with a weight of 1). |
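
The dataset splits quoted in the table can be loaded directly through torchvision, which ships Cityscapes with the reported train/val partition (2975/500). This is a minimal sketch, assuming the dataset has already been downloaded to a local `root` directory; NYUv2 has no built-in torchvision loader and is therefore omitted here.

```python
from torchvision import datasets

# Cityscapes with the splits quoted in the table; "./cityscapes" is a
# placeholder path that must contain the downloaded leftImg8bit/gtFine data.
train_set = datasets.Cityscapes(root="./cityscapes", split="train",
                                mode="fine", target_type="semantic")
val_set = datasets.Cityscapes(root="./cityscapes", split="val",
                              mode="fine", target_type="semantic")

print(len(train_set), len(val_set))  # expected: 2975 500
```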
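The experiment-setup row translates into a standard PyTorch training loop. The sketch below is a hypothetical reconstruction under stated assumptions: the toy two-head model, the loss functions, and the scheduler decay factor `gamma=0.1` are illustrative stand-ins (the paper reports only the step epochs, 60 and 70), and this is not the authors' MTSL implementation, which additionally creates and removes neurons during training.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

# Toy two-task model standing in for the paper's ResNet18 backbone with
# DeepLab heads (segmentation + depth as an illustrative task pair).
class TwoHeadNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
        self.seg_head = nn.Conv2d(8, 19, 1)    # 19 Cityscapes classes
        self.depth_head = nn.Conv2d(8, 1, 1)   # monocular depth

    def forward(self, x):
        f = self.backbone(x)
        return self.seg_head(f), self.depth_head(f)

model = TwoHeadNet()

# Reported settings: Adam, lr 1e-4, weight decay 5e-5, 80 epochs,
# step-wise schedule at epochs 60 and 70, batch size 16.
optimizer = optim.Adam(model.parameters(), lr=1e-4, weight_decay=5e-5)
scheduler = optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[60, 70], gamma=0.1)  # gamma=0.1 is an assumption

# Random toy data so the sketch runs end to end.
images = torch.randn(32, 3, 64, 64)
seg_gt = torch.randint(0, 19, (32, 64, 64))
depth_gt = torch.randn(32, 1, 64, 64)
loader = DataLoader(TensorDataset(images, seg_gt, depth_gt), batch_size=16)

seg_loss_fn = nn.CrossEntropyLoss()
depth_loss_fn = nn.L1Loss()

for epoch in range(80):
    for x, seg_y, depth_y in loader:
        optimizer.zero_grad()
        seg_out, depth_out = model(x)
        # All task losses carry equal weight (1.0 each), as stated in the paper.
        loss = 1.0 * seg_loss_fn(seg_out, seg_y) + 1.0 * depth_loss_fn(depth_out, depth_y)
        loss.backward()
        optimizer.step()
    scheduler.step()  # decays the learning rate at epochs 60 and 70
```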