Variational Multi-Task Learning with Gumbel-Softmax Priors

Authors: Jiayi Shen, Xiantong Zhen, Marcel Worring, Ling Shao

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate that the proposed VMTL is able to effectively tackle a variety of challenging multi-task learning settings with limited training data for both classification and regression. Our method consistently surpasses previous methods, including strong Bayesian approaches, and achieves state-of-the-art performance on five benchmark datasets. We conduct extensive experiments to evaluate the proposed VMTL on five benchmark datasets for both classification and regression tasks.
Researcher Affiliation | Collaboration | 1: AIM Lab, University of Amsterdam, Netherlands; 2: Inception Institute of Artificial Intelligence, Abu Dhabi, UAE
Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks.
Open Source Code | Yes | The code will be available at https://github.com/autumn9999/VMTL.git.
Open Datasets | Yes | Office-Home [47] contains images from four domains/tasks: Artistic (A), Clipart (C), Product (P) and Real-world (R). Office-Caltech [16] contains the ten categories shared between Office-31 [39] and Caltech-256 [18]. ImageCLEF [33], the benchmark for the ImageCLEF domain adaptation challenge, contains 12 common categories shared by four public datasets/tasks: Caltech-256 (C), ImageNet ILSVRC 2012 (I), Pascal VOC 2012 (P), and Bing (B). DomainNet [36] is a large-scale dataset with approximately 0.6 million images distributed among 345 categories. Rotated MNIST [29] is adopted for angle regression tasks.
Dataset Splits | Yes | randomly selecting 5%, 10%, and 20% of samples from each task in the dataset as the training set, using the remaining samples as the test set [33]. For the large-scale DomainNet dataset, we set the splits to 1%, 2% and 4%... For the regression dataset, Rotated MNIST, we set the splits to 0.1% and 0.2%...
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions models and optimizers (VGGnet, MLPs, Adam optimizer) but does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | We adopt the Adam optimizer [27] with a learning rate of 1e-4 for training. All the results are obtained based on a 95% confidence interval from five runs. The temperature of the Gumbel-Softmax priors (6) and (9) is annealed using the same schedule applied in [22]: we start with a high temperature and gradually anneal it to a small but non-zero value. For the KL-divergence in (11), we use the annealing scheme from [6]. L and M are set to 10, which yields good performance while being computationally efficient.
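
The Dataset Splits row above describes a per-task random split: a fixed fraction of each task's samples (5%, 10%, or 20%; smaller fractions for DomainNet and Rotated MNIST) is drawn for training and the remainder is used for testing. Below is a minimal sketch of that protocol, assuming a simple index permutation per task; the function name `split_task` and the seed handling are illustrative and not taken from the released code.

```python
import numpy as np

def split_task(num_samples, train_fraction, seed=0):
    """Return train/test index arrays for one task by random permutation."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_samples)
    n_train = int(round(train_fraction * num_samples))
    return perm[:n_train], perm[n_train:]

# Example: a 10% training split for a task with 4,000 images.
train_idx, test_idx = split_task(num_samples=4000, train_fraction=0.10, seed=0)
assert len(train_idx) == 400 and len(test_idx) == 3600
```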
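
The Experiment Setup row refers to two annealing schedules without spelling them out. The sketch below shows one plausible reading, assuming the exponential temperature decay used for Gumbel-Softmax training in [22] (start high, clamp at a small non-zero value, update every fixed number of steps) and a monotonically increasing KL warm-up weight in the spirit of [6]; all constants (`tau0`, `tau_min`, `rate`, `every`, `warmup_steps`) are assumptions, not values from the paper.

```python
import math

def gumbel_softmax_temperature(step, tau0=1.0, tau_min=0.5, rate=1e-4, every=1000):
    """Exponentially anneal the temperature, updated every `every` steps:
    tau = max(tau_min, tau0 * exp(-rate * step))."""
    effective_step = (step // every) * every
    return max(tau_min, tau0 * math.exp(-rate * effective_step))

def kl_weight(step, warmup_steps=10000):
    """Linearly increase the weight on the KL term from 0 to 1 during warm-up."""
    return min(1.0, step / warmup_steps)

for step in (0, 5000, 20000):
    print(step, gumbel_softmax_temperature(step), kl_weight(step))
```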
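
The same row reports results as a 95% confidence interval over five runs. The paper does not give the formula; a common choice, shown here purely as an assumption, is a two-sided Student-t interval over the five per-run scores (the scores below are randomly generated placeholders, not results from the paper).

```python
import numpy as np
from scipy import stats

# Placeholder scores for five runs; values are synthetic.
runs = np.random.default_rng(0).normal(loc=70.0, scale=0.5, size=5)
mean = runs.mean()
# Half-width of a two-sided 95% Student-t interval with n-1 degrees of freedom.
half_width = stats.t.ppf(0.975, df=len(runs) - 1) * stats.sem(runs)
print(f"{mean:.2f} +/- {half_width:.2f}")
```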