Bayesian Neural Word Embedding
Authors: Oren Barkan
AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present experimental results that demonstrate the performance of the proposed algorithm for word analogy and similarity tasks on six different datasets and show it is competitive with the original Skip-Gram method. |
| Researcher Affiliation | Collaboration | Oren Barkan Tel Aviv University, Israel Microsoft, Israel |
| Pseudocode | Yes | The algorithm is described in Fig. 1 and includes three main stages. |
| Open Source Code | No | The paper references the word2vec implementation's URL (https://code.google.com/p/word2vec), which is a third-party tool, but it does not provide a link to, or an explicit statement about open-sourcing, the authors' own Bayesian Skip-Gram (BSG) code. |
| Open Datasets | Yes | We trained both models on the corpus from (Chelba et al. 2014). |
| Dataset Splits | No | The paper describes the training corpus and the evaluation datasets. However, it does not specify explicit training/validation/test splits (e.g., percentages or counts for each subset) of the main training corpus, nor does it mention cross-validation. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., CPU, GPU models, memory, or cloud instance types). |
| Software Dependencies | No | The paper mentions using the "word2vec implementation" for SG, but it does not specify any software dependencies with version numbers for either SG or the proposed BSG method. |
| Experiment Setup | Yes | Specifically, we set the target representation dimension m = 40, maximal window size c_max = 4, subsampling parameter ρ = 10⁻⁵, vocabulary size l = 30000 and negative to positive ratio N = 1. For BSG, we further set τ = 1, κ = 10 and γ = 0.7 (note that BSG is quite robust to the choice of γ as long as 0.5 < γ < 1). Both models were trained for K = 40 iterations (we verified their convergence after ~30 iterations). |
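
For quick reference, the hyperparameters quoted above can be collected into a single configuration. The snippet below is a minimal sketch; the key names are illustrative assumptions, since the paper does not release code or define a configuration format.

```python
# Hypothetical configuration collecting the hyperparameters reported for SG and BSG.
# Key names are illustrative; the paper releases no code or config schema.
BSG_CONFIG = {
    "embedding_dim": 40,      # target representation dimension m
    "max_window_size": 4,     # maximal window size c_max
    "subsampling": 1e-5,      # subsampling parameter rho
    "vocab_size": 30_000,     # vocabulary size l
    "neg_pos_ratio": 1,       # negative-to-positive ratio N
    "iterations": 40,         # training iterations K (convergence observed after ~30)
    # BSG-specific hyperparameters
    "tau": 1.0,
    "kappa": 10,
    "gamma": 0.7,             # reported robust for 0.5 < gamma < 1
}
```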