Complementary Learning of Word Embeddings
Authors: Yan Song, Shuming Shi
IJCAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results indicate that our approach can effectively improve the quality of initial embeddings, in terms of intrinsic and extrinsic evaluations. |
| Researcher Affiliation | Industry | Yan Song, Shuming Shi Tencent AI Lab {clksong, shumingshi}@tencent.com |
| Pseudocode | Yes | Algorithm 1: Complementary learning of word embeddings using CB and SG. |
| Open Source Code | No | The paper does not provide explicit information or a link to open-source code for the described methodology. |
| Open Datasets | Yes | We prepare the latest dump of Wikipedia articles (https://dumps.wikimedia.org/enwiki/latest/) as the base corpus for training word embeddings, which contains approximately 2 billion word tokens. We use the MEN-3k [Bruni et al., 2012], Simlex-999 [Hill et al., 2015] and WS-353 [Finkelstein et al., 2002] data sets... The extrinsic evaluation is conducted on text classification with four datasets: the 20Newsgroups (20NG, bydate version from http://qwone.com/~jason/20Newsgroups/) for topic classification, ATIS [Hemphill et al., 1990] for intent classification, TREC [Li and Roth, 2002] for question type classification and IMDB [Maas et al., 2011] for sentiment classification. |
| Dataset Splits | Yes | All datasets are organized following their standard split. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware used for the experiments (e.g., GPU/CPU models, memory). |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | All baseline and our embedding models are trained with the same hyper-parameters, i.e., 200 dimensions, 5 as the word frequency cutoff, a window size of 5 words, 4 iterations, using hierarchical softmax as learning strategy. ... discount learning rates γ1 and γ2 are required as input. ... hyper-parameter λ adjusting the contribution of different sub-rewards. |
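
The "Experiment Setup" row quotes the hyper-parameters shared by all baseline and proposed models. A minimal sketch of how that baseline configuration might be reproduced with gensim 4.x is given below; the choice of gensim, the corpus file name, and the output path are assumptions, and this covers only the shared baseline settings, not the paper's complementary-learning procedure (the discount rates γ1, γ2 and the weight λ belong to Algorithm 1, which is not released).

```python
# Hedged sketch: baseline word2vec training with the quoted hyper-parameters.
# Assumes gensim 4.x and a pre-tokenized Wikipedia dump, one sentence per line.
from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

corpus = LineSentence("enwiki_tokenized.txt")  # hypothetical preprocessed dump

model = Word2Vec(
    corpus,
    vector_size=200,    # 200 dimensions
    min_count=5,        # word frequency cutoff of 5
    window=5,           # window size of 5 words
    epochs=4,           # 4 iterations, as quoted in the setup row
    hs=1, negative=0,   # hierarchical softmax as the learning strategy
    sg=1,               # sg=1 for Skip-gram; sg=0 for CBOW (the paper uses both)
)

model.wv.save_word2vec_format("sg_200d.txt")
```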
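
The "Pseudocode" row refers to Algorithm 1, which combines continuous bag-of-words (CB) and skip-gram (SG) learners. Since no code is released, the toy sketch below illustrates only the general idea of interleaving simplified CB and SG updates on shared embedding matrices; it is an assumption-laden illustration, not the paper's algorithm, and it omits the reward terms, the discount rates γ1/γ2, the weight λ, and hierarchical softmax.

```python
import numpy as np

# Toy interleaving of CBOW (CB) and Skip-gram (SG) updates on shared
# embedding matrices; positive pairs only, for brevity.
rng = np.random.default_rng(0)
V, D = 5, 8                                 # toy vocabulary size and dimension
W_in = rng.normal(scale=0.1, size=(V, D))   # input (word) embeddings
W_out = rng.normal(scale=0.1, size=(V, D))  # output (context) embeddings

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sg_step(center, context, lr):
    """Skip-gram: the center word predicts one context word."""
    grad = sigmoid(W_in[center] @ W_out[context]) - 1.0
    g_in = grad * W_out[context]
    W_out[context] -= lr * grad * W_in[center]
    W_in[center] -= lr * g_in

def cb_step(context_ids, center, lr):
    """CBOW: the averaged context predicts the center word."""
    h = W_in[context_ids].mean(axis=0)
    grad = sigmoid(h @ W_out[center]) - 1.0
    g_in = grad * W_out[center]
    W_out[center] -= lr * grad * h
    W_in[context_ids] -= lr * g_in / len(context_ids)

# Interleave the two learners over the same toy corpus so each objective
# refines the embeddings produced by the other.
corpus = [[0, 1, 2, 3, 4], [4, 3, 2, 1, 0]]
for sent in corpus:
    for i, center in enumerate(sent):
        ctx = [sent[j] for j in range(max(0, i - 2), min(len(sent), i + 3)) if j != i]
        cb_step(np.array(ctx), center, lr=0.025)
        for c in ctx:
            sg_step(center, c, lr=0.025)
```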