Learning to Multitask

Authors: Yu Zhang, Ying Wei, Qiang Yang

NeurIPS 2018

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on benchmark datasets show the effectiveness of the proposed L2MT framework. |
| Researcher Affiliation | Collaboration | Yu Zhang (HKUST), Ying Wei (Tencent AI Lab), Qiang Yang (HKUST); yu.zhang.ust@gmail.com, judywei@tencent.com, qyang@cse.ust.hk |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements or links indicating that open-source code for the methodology is available. |
| Open Datasets | Yes | Four datasets are used in the experiments: MIT-Indoor-Scene and Caltech256 for image classification, and 20newsgroup and RCV1 for text classification. |
| Dataset Splits | Yes | "Each multitask problem S_i consists of m_i learning tasks, each of which is associated with a training dataset, a validation dataset, and a test dataset... we vary the size of training data from 30% to 50% at an interval of 10%, with the validation proportion fixed to 30% in the test process." |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments, such as specific CPU or GPU models. |
| Software Dependencies | No | The paper mentions using "the Adam optimizer in the tensorflow package" but does not specify version numbers for TensorFlow or other software dependencies. |
| Experiment Setup | Yes | "Each entry in {L_i} is initialized to be normally distributed with zero mean and variance of 1/100, and the biases {β_i} are initialized to be zero. α in the estimation function is initialized to [1, 1, 1, 0.1]^T and γ in the link function is initialized to [1, 0]^T. The learning rate linearly decays from 0.01 with respect to the number of epochs... when λ is in [0.01, 0.5] and k in [5, 10], the performance is not so sensitive that the choices are easier, and hence in experiments we always set λ and k to 0.1 and 6... Based on such observation, d̂ is set to be 50." |
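The Experiment Setup row pins down the initialization and fixed hyperparameters reported in the paper. Since the authors' code is not released, the following is a minimal NumPy sketch of that setup; all function and variable names (`init_params`, `lr_schedule`, `layer_dim`, etc.) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

D_HAT = 50        # reduced dimension chosen in the paper
LAM, K = 0.1, 6   # lambda and k fixed across experiments

def init_params(num_layers, layer_dim):
    """Initialize parameters as described in the paper's setup."""
    # Each entry of {L_i} ~ N(0, 1/100), i.e. standard deviation 0.1.
    Ls = [rng.normal(0.0, np.sqrt(1.0 / 100), size=(layer_dim, layer_dim))
          for _ in range(num_layers)]
    # Biases {beta_i} start at zero.
    betas = [np.zeros(layer_dim) for _ in range(num_layers)]
    # alpha (estimation function) and gamma (link function) initial values.
    alpha = np.array([1.0, 1.0, 1.0, 0.1])
    gamma = np.array([1.0, 0.0])
    return Ls, betas, alpha, gamma

def lr_schedule(epoch, num_epochs, lr0=0.01):
    """Learning rate decays linearly from 0.01 over the epochs."""
    return lr0 * (1.0 - epoch / num_epochs)
```

In the paper, optimization is then carried out with the Adam optimizer in TensorFlow; the linear schedule above only reproduces the stated decay of the learning rate, not the full training loop.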