Approximating Word Ranking and Negative Sampling for Word Embedding

Authors: Guibing Guo, Shichang Ouyang, Fajie Yuan, Xingwei Wang

IJCAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical experiments show that OptRank consistently outperforms its counterparts on a benchmark dataset with different sampling scales, especially when the sampled subset is small. The code and datasets can be obtained from https://github.com/ouououououou/OptRank
Researcher Affiliation | Academia | Northeastern University, China; University of Glasgow, UK
Pseudocode | Yes | Algorithm 1: The OptRank learning algorithm
Open Source Code | Yes | The code and datasets can be obtained from https://github.com/ouououououou/OptRank
Open Datasets | Yes | The training dataset used in our experiments is the Wikipedia 2017 articles (Wiki2017, http://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2), which contains around 2.3 billion words (14G).
Dataset Splits | No | The paper mentions training on Wikipedia 2017 articles and testing on various benchmark datasets (word analogy, word similarity datasets), but does not explicitly describe a validation set or specific train/validation/test splits from the primary training data.
Hardware Specification | No | The paper does not specify any hardware details such as GPU/CPU models, memory, or specific computing environments used for running the experiments.
Software Dependencies | No | The paper describes parameter settings for the models but does not provide specific software dependencies with version numbers (e.g., Python, TensorFlow, PyTorch versions).
Experiment Setup | Yes | For the CBOW-p, CBOW-a and OptRank models, as suggested by [Mikolov et al., 2013; Chen et al., 2017], the down-sampling rate is set to 0.001; the learning rate starts at α = 0.025 and decays as α_t = α(1 − t/T), where T is the sample size and t is the index of the current training example. Besides, window size = 8, dimension = 300, and the number of negative samples is 15 on the five subsets and 2 on the whole Wiki2017 dataset, respectively. For the power parameter used in negative sampling, we find that power = 0.75 offers the best accuracy for the CBOW-p and OptRank models, while power = 0.005 is suggested by [Chen et al., 2017] and adopted for CBOW-a. Specifically, the value of ε in OptRank should be adjusted to the size of the corpus: we set ε to 0.5 on the five subsets and 1.0 on Wiki2017 (14G). For the WordRank model, we adopt the settings given by [Ji et al., 2015]: logarithm as the objective function, initial scale parameter α = 100 and offset parameter β = 99. The dimension of word vectors is also set to 300.
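The two numeric schedules in the setup row — the linear learning-rate decay α_t = α(1 − t/T) and the unigram distribution raised to the `power` exponent for negative sampling — can be sketched as follows. This is a minimal illustration following standard word2vec conventions, not the authors' released code; the function names are ours.

```python
import numpy as np

def learning_rate(t, T, alpha0=0.025):
    """Linearly decayed learning rate: alpha_t = alpha0 * (1 - t/T)."""
    return alpha0 * (1.0 - t / T)

def negative_sampling_probs(word_counts, power=0.75):
    """Unigram counts raised to `power`, renormalized into a distribution.

    power = 0.75 flattens the raw frequency distribution, so rare words
    are drawn as negatives more often than their raw counts suggest.
    """
    counts = np.asarray(word_counts, dtype=np.float64)
    smoothed = counts ** power
    return smoothed / smoothed.sum()
```

For example, with counts [100, 10, 1] the rarest word's raw probability is 1/111 ≈ 0.009, but after smoothing with power = 0.75 it rises to roughly 0.026, which is the intended effect of the exponent.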