Lifelong Domain Word Embedding via Meta-Learning

Authors: Hu Xu, Bing Liu, Lei Shu, Philip S. Yu

IJCAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that domain embeddings produced from such a process improve the performance of the downstream tasks. We use the Amazon Review datasets from [He and McAuley, 2016], which is a collection of multiple-domain corpora. Table 2 shows the main results. We observe that the proposed method L-DEM 200D + ND 30M performs the best.
Researcher Affiliation | Academia | 1 Department of Computer Science, University of Illinois at Chicago, Chicago, IL, USA; 2 Institute for Data Science, Tsinghua University, Beijing, China. {hxu48, liub, lshu3, psyu}@uic.edu
Pseudocode | Yes | Algorithm 1: Identifying Context Words from the Past
Open Source Code | No | The paper does not contain an explicit statement about releasing source code for the methodology described, nor does it provide a direct link to a code repository.
Open Datasets | Yes | We use the Amazon Review datasets from [He and McAuley, 2016], which is a collection of multiple-domain corpora.
Dataset Splits | Yes | We split the 56 domains into 39 domains for training, 5 domains for validation and 12 domains for testing. We select 3500 examples for training, 500 examples for validation and 2000 examples for testing.
Hardware Specification | No | The paper does not specify any particular hardware (e.g., CPU or GPU models, memory) used for the experiments; it only mentions "limited computing resource".
Software Dependencies | No | The paper mentions software components such as the "skip-gram model", "Adam optimizer", and "Bi-LSTM model" but does not provide specific version numbers for any of the software or libraries used in the implementation.
Experiment Setup | Yes | We set the size of a context window to be 5 when building feature vectors. We use the default hyperparameters of the skip-gram model [Mikolov et al., 2013b] to train the domain embeddings. We apply a dropout rate of 0.5 on all layers except the last one and use Adam [Kingma and Ba, 2014] as the optimizer. We empirically set δ = 0.7 as the threshold on the similarity score in Algorithm 1.
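
The Experiment Setup row quotes the δ = 0.7 similarity threshold used in Algorithm 1 (Identifying Context Words from the Past). A minimal sketch of such threshold-based selection, assuming cosine similarity over word feature vectors; the function and variable names here are illustrative and not taken from the paper:

```python
import math

DELTA = 0.7  # similarity threshold reported in the paper


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_context_words(new_domain_feats, past_domain_feats, delta=DELTA):
    """Keep a past domain's context word only when its feature vector is
    similar enough (>= delta) to the same word's new-domain feature vector."""
    selected = []
    for word, past_vec in past_domain_feats.items():
        new_vec = new_domain_feats.get(word)
        if new_vec is not None and cosine_similarity(new_vec, past_vec) >= delta:
            selected.append(word)
    return selected
```

For example, a word whose past-domain vector points in nearly the same direction as its new-domain vector passes the δ = 0.7 gate, while an orthogonal (domain-shifted) word is filtered out.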