Generative multitask learning mitigates target-causing confounding

Authors: Taro Makino, Krzysztof Geras, Kyunghyun Cho

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our results on the Attributes of People and Taskonomy datasets reflect an improved robustness to target shift across four multitask learning methods.
Researcher Affiliation | Collaboration | NYU Center for Data Science; NYU Grossman School of Medicine; Genentech; CIFAR LMB
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | We include our code in the supplementary material.
Open Datasets | Yes | In order to empirically validate GMTL, we perform experiments on two datasets called Attributes of People [Bourdev et al., 2011] and Taskonomy [Zamir et al., 2018].
Dataset Splits | Yes | Attributes of People: 4,013 examples in the training set and 4,022 in the test set. Since the authors did not specify a validation set, we randomly sample 20% of the training set to use as the validation set. Taskonomy: 548,785 examples in the training set, 121,974 in the validation set, and 90,658 in the test set.
Hardware Specification | No | The paper refers to the supplementary material for details on compute resources and hardware, but no specific hardware models (e.g., GPU, CPU) are mentioned in the main text.
Software Dependencies | No | The paper mentions ResNet-50, the Adam optimizer, and ImageNet, but does not specify software dependencies with version numbers (e.g., deep learning framework or Python versions).
Experiment Setup | Yes | For all datasets and tasks, we use ResNet-50 [He et al., 2016] pretrained on ImageNet [Deng et al., 2009] for the single-task networks. We train using Adam [Kingma and Ba, 2015] with L2 regularization, and tune the learning rate and regularization multiplier. For data augmentation, during training we resize the images to 256×256, randomly crop them to 224×224, and randomly horizontally flip them. During validation and testing, we resize the images to 256×256 and center crop them to 224×224. For all experiments, we train single-task networks from five random initializations.