Lifelong Domain Word Embedding via Meta-Learning
Authors: Hu Xu, Bing Liu, Lei Shu, Philip S. Yu
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that domain embeddings produced from such a process improve the performance of the downstream tasks. We use the Amazon Review datasets from [He and McAuley, 2016], which is a collection of multiple-domain corpora. Table 2 shows the main results. We observe that the proposed method L-DEM 200D + ND 30M performs the best. |
| Researcher Affiliation | Academia | (1) Department of Computer Science, University of Illinois at Chicago, Chicago, IL, USA; (2) Institute for Data Science, Tsinghua University, Beijing, China. Emails: {hxu48, liub, lshu3, psyu}@uic.edu |
| Pseudocode | Yes | Algorithm 1: Identifying Context Words from the Past (a hedged re-implementation sketch follows the table) |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code for the methodology described, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | We use the Amazon Review datasets from [He and McAuley, 2016], which is a collection of multiple-domain corpora. |
| Dataset Splits | Yes | We split the 56 domains into 39 domains for training, 5 domains for validation and 12 domains for testing. We select 3500 examples for training, 500 examples for validation and 2000 examples for testing. (A split sketch follows the table.) |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., CPU, GPU models, memory) used for the experiments. It only mentions "limited computing resource". |
| Software Dependencies | No | The paper mentions software components like "skip-gram model", "Adam optimizer", and "Bi-LSTM model" but does not provide specific version numbers for any of the software or libraries used in implementation. |
| Experiment Setup | Yes | We set the size of a context window to be 5 when building feature vectors. We use the default hyperparameters of the skip-gram model [Mikolov et al., 2013b] to train the domain embeddings. We apply a dropout rate of 0.5 on all layers except the last one and use Adam [Kingma and Ba, 2014] as the optimizer. We empirically set δ = 0.7 as the threshold on the similarity score in Algorithm 1. (A hedged setup sketch follows the table.) |
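
For reference, a minimal Python sketch of what the Pseudocode row points to: Algorithm 1 keeps context words from past domains whose similarity score for a word exceeds the threshold δ (0.7 in the paper). Cosine similarity over co-occurrence feature vectors below stands in for the paper's learned meta-similarity, and names such as `past_feats` are hypothetical.

```python
import numpy as np

def identify_past_context_words(new_feat, past_feats, delta=0.7):
    """Retain context words from past domains whose feature vectors are
    sufficiently similar to the word's feature vector in the new domain.

    new_feat: 1-D co-occurrence feature vector of the word in the new domain.
    past_feats: dict mapping (domain, context_word) -> feature vector for the
                same word in that past domain (hypothetical layout).
    delta: similarity threshold; the paper empirically sets 0.7.
    """
    kept = []
    for (domain, context_word), feat in past_feats.items():
        # Cosine similarity stands in for the paper's learned meta-similarity score.
        sim = float(np.dot(new_feat, feat) /
                    (np.linalg.norm(new_feat) * np.linalg.norm(feat) + 1e-12))
        if sim >= delta:
            kept.append((domain, context_word, sim))
    return kept
```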
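
A minimal sketch of the 39/5/12 domain partition quoted in the Dataset Splits row; which particular domains land in each split and the random seed are assumptions, not taken from the paper.

```python
import random

def split_domains(domains, n_train=39, n_val=5, n_test=12, seed=0):
    """Partition the 56 Amazon review domains into train/val/test sets,
    mirroring the 39/5/12 split reported in the paper."""
    assert len(domains) == n_train + n_val + n_test
    rng = random.Random(seed)          # seed is an assumption for reproducibility
    shuffled = list(domains)
    rng.shuffle(shuffled)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])
```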
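
A hedged sketch of the Experiment Setup row, assuming gensim (4.x API) for the skip-gram domain embeddings and PyTorch for the downstream classifier; the paper does not name these toolkits, and the placeholder corpus, hidden-layer size, and output size are assumptions (200 dimensions follows the "L-DEM 200D" naming).

```python
from gensim.models import Word2Vec
import torch.nn as nn
import torch.optim as optim

# Skip-gram domain embeddings with default-style hyperparameters; window=5
# mirrors the context-window size the paper uses when building feature vectors.
sentences = [["this", "phone", "has", "great", "battery"]]  # placeholder corpus
emb = Word2Vec(sentences, vector_size=200, window=5, sg=1, min_count=1)

# Downstream classifier fragment: dropout 0.5 on all layers except the last,
# trained with the Adam optimizer, as reported in the setup.
model = nn.Sequential(
    nn.Linear(200, 128), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(128, 2),  # no dropout on the final layer
)
optimizer = optim.Adam(model.parameters())
```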