Meta Multi-Task Learning for Sequence Modeling
Authors: Junkun Chen, Xipeng Qiu, Pengfei Liu, Xuanjing Huang
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on two types of tasks, text classification and sequence tagging, which demonstrate the benefits of our approach. Table 3 reports the classification accuracies on the product-review tasks; Table 5 reports the accuracies or F1 scores of our models on the sequence tagging datasets, compared against state-of-the-art results. |
| Researcher Affiliation | Academia | Junkun Chen, Xipeng Qiu, Pengfei Liu, Xuanjing Huang Shanghai Key Laboratory of Intelligent Information Processing, Fudan University School of Computer Science, Fudan University 825 Zhangheng Road, Shanghai, China {jkchen16, xpqiu, pfliu14, xjhuang}@fudan.edu.cn |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link for open-source code availability related to the described methodology. |
| Open Datasets | Yes | For the classification task, we test our model on 16 classification datasets; the first 14 datasets are product reviews collected based on the dataset constructed by Blitzer et al. (2007) (https://www.cs.jhu.edu/~mdredze/datasets/sentiment/). IMDB: movie reviews with labels of subjective or objective (Maas et al. 2011) (https://www.cs.jhu.edu/~mdredze/datasets/sentiment/unprocessed.tar.gz). MR: movie reviews with two classes (Pang and Lee 2005) (https://www.cs.cornell.edu/people/pabo/movie-review-data/). |
| Dataset Splits | Yes | The datasets in each domain are partitioned randomly into training data, development data and testing data with the proportion of 70%, 10% and 20% respectively. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'backpropagation and the gradient-based optimization is performed using the Adagrad update rule' but does not specify any software libraries or their version numbers. |
| Experiment Setup | Yes | The final hyper-parameters are set as in Table 2 (Hyper-parameters of our models): embedding dimension d = 200; size of h in Basic-LSTM = 100; size of ĥ in Meta-LSTM m = 40; size of meta vector z = 40; initial learning rate 0.1; regularization 1e-5. |
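The reported 70%/10%/20% per-domain split is the kind of detail a re-implementation must get right. Below is a minimal sketch of one way to reproduce it; the function name, the fixed seed, and the rounding convention (remainder assigned to the test split) are assumptions, since the paper does not specify them.

```python
import random


def split_dataset(examples, seed=0):
    """Randomly partition one domain's examples into train/dev/test
    with the 70%/10%/20% proportions reported in the paper.

    The seed and the choice to give rounding leftovers to the test
    split are illustrative assumptions, not details from the paper.
    """
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(0.7 * n)
    n_dev = int(0.1 * n)
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]
    return train, dev, test


train, dev, test = split_dataset(range(1000))
print(len(train), len(dev), len(test))  # 700 100 200
```

Because each domain is partitioned independently, the function would be called once per review domain rather than on the pooled 16-dataset collection.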