Meta Multi-Task Learning for Sequence Modeling

Authors: Junkun Chen, Xipeng Qiu, Pengfei Liu, Xuanjing Huang

AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility assessment: each variable below is listed with its result and the supporting LLM response.
Research Type: Experimental. LLM response: "We conduct extensive experiments on two types of tasks, text classification and sequence tagging, which demonstrate the benefits of our approach." "Table 3 shows the classification accuracies on the tasks of product reviews." "Table 5 shows the accuracies or F1 scores of our models on the sequence tagging datasets, compared to some state-of-the-art results."
Researcher Affiliation: Academia. LLM response: "Junkun Chen, Xipeng Qiu, Pengfei Liu, Xuanjing Huang; Shanghai Key Laboratory of Intelligent Information Processing, Fudan University; School of Computer Science, Fudan University; 825 Zhangheng Road, Shanghai, China; {jkchen16, xpqiu, pfliu14, xjhuang}@fudan.edu.cn"
Pseudocode: No. LLM response: The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code: No. LLM response: The paper provides no statement or link indicating that source code for the described methodology is available.
Open Datasets: Yes. LLM response: "For the classification task, we test our model on 16 classification datasets; the first 14 are product-review datasets collected on the basis of the dataset constructed by Blitzer et al. (2007) (https://www.cs.jhu.edu/~mdredze/datasets/sentiment/). IMDB: movie reviews with two classes (Maas et al. 2011) (https://www.cs.jhu.edu/~mdredze/datasets/sentiment/unprocessed.tar.gz). MR: movie reviews with two classes (Pang and Lee 2005) (https://www.cs.cornell.edu/people/pabo/movie-review-data/)."
Dataset Splits: Yes. LLM response: "The datasets in each domain are partitioned randomly into training data, development data and testing data with the proportions of 70%, 10% and 20%, respectively."
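The split is reported only as proportions, with no accompanying code. Below is a minimal sketch of such a per-domain 70/10/20 random partition; the function name, seed, and data representation are assumptions, not from the paper:

```python
import random

def split_domain(examples, seed=0):
    """Randomly partition one domain's examples into train/dev/test
    with the 70%/10%/20% proportions reported in the paper.
    (Hypothetical helper; the paper does not release splitting code.)"""
    examples = list(examples)
    random.Random(seed).shuffle(examples)
    n = len(examples)
    n_train, n_dev = int(0.7 * n), int(0.1 * n)
    train = examples[:n_train]
    dev = examples[n_train:n_train + n_dev]
    test = examples[n_train + n_dev:]  # remaining ~20%
    return train, dev, test
```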
Hardware Specification: No. LLM response: The paper does not provide specific hardware details (e.g., CPU or GPU models, memory) used to run the experiments.
Software Dependencies: No. LLM response: The paper mentions that "backpropagation and the gradient-based optimization is performed using the Adagrad update rule" but does not specify any software libraries or their version numbers.
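The only optimization detail given is the Adagrad update rule (Duchi, Hazan, and Singer 2011), which is straightforward to write out. A minimal NumPy sketch follows; the 0.1 learning rate matches Table 2 below, while the function name and epsilon are assumptions:

```python
import numpy as np

def adagrad_step(param, grad, accum, lr=0.1, eps=1e-8):
    """One Adagrad update: scale the learning rate per parameter by the
    inverse square root of the running sum of squared gradients.
    (Illustrative sketch, not the authors' implementation.)"""
    accum += grad ** 2                           # accumulate squared gradients
    param -= lr * grad / (np.sqrt(accum) + eps)  # scaled gradient step
    return param, accum
```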
Experiment Setup: Yes. LLM response: "The final hyper-parameters are set as Table 2." Table 2 (hyper-parameters of the models): embedding dimension d = 200; Basic-LSTM hidden size h = 100; Meta-LSTM hidden size (ĥ) m = 40; meta vector z size = 40; initial learning rate 0.1; regularization 1e-5.
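For quick reference, the Table 2 values can be transcribed into a configuration dict; the key names here are illustrative, and the Meta-LSTM cell itself (the paper's contribution) is not reconstructed:

```python
# Table 2 hyper-parameters; values are the paper's, key names are assumptions.
CONFIG = {
    "embedding_dim": 200,      # d: word embedding dimension
    "basic_lstm_size": 100,    # h: Basic-LSTM hidden size
    "meta_lstm_size": 40,      # m: Meta-LSTM hidden size (ĥ)
    "meta_vector_size": 40,    # z: size of the meta vector
    "initial_lr": 0.1,         # initial Adagrad learning rate
    "l2_regularization": 1e-5,
}
```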