Neural Word Embedding as Implicit Matrix Factorization

Authors: Omer Levy, Yoav Goldberg

NeurIPS 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate the word representations on four datasets, covering word similarity and relational analogy tasks. We used two datasets to evaluate pairwise word similarity: Finkelstein et al.'s WordSim353 [13] and Bruni et al.'s MEN [4]. These datasets contain word pairs together with human-assigned similarity scores."
Researcher Affiliation | Academia | Omer Levy, Department of Computer Science, Bar-Ilan University (omerlevy@gmail.com); Yoav Goldberg, Department of Computer Science, Bar-Ilan University (yoav.goldberg@gmail.com)
Pseudocode | No | The paper describes methods like SGNS and SVD but does not present them in a structured pseudocode or algorithm block.
Open Source Code | Yes | "To train the SGNS models, we used a modified version of word2vec which receives a sequence of pre-extracted word-context pairs [18]." The accompanying footnote links to http://www.bitbucket.org/yoavgo/word2vecf
Open Datasets | Yes | "All models were trained on English Wikipedia, pre-processed by removing non-textual elements, sentence splitting, and tokenization. The corpus contains 77.5 million sentences, spanning 1.5 billion tokens."
Dataset Splits | No | No explicit training/validation/test splits (e.g., percentages, sample counts, or a cross-validation setup) are provided for the English Wikipedia corpus used for training.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are mentioned.
Software Dependencies | No | The paper mentions using "a modified version of word2vec" but does not provide version numbers for it or for any other software dependency.
Experiment Setup | Yes | "All models were derived using a window of 2 tokens to each side of the focus word, ignoring words that appeared less than 100 times in the corpus, resulting in vocabularies of 189,533 terms for both words and contexts. ... We experimented with three values of k (number of negative samples in SGNS, shift parameter in PMI-based methods): 1, 5, 15."
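The word-similarity evaluation quoted in the Research Type row is conventionally scored by ranking word pairs by embedding cosine similarity and correlating that ranking with the human scores via Spearman's rho. A minimal, self-contained sketch of that protocol; the embeddings and similarity scores below are toy values of mine, not data from WordSim353 or MEN:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors given as plain lists."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def spearman(xs, ys):
    """Spearman's rho = Pearson correlation of the ranks (no tie handling)."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0] * len(vals)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Toy embeddings standing in for trained word vectors (hypothetical data).
emb = {
    "cat": [1.0, 0.2, 0.0],
    "dog": [0.9, 0.3, 0.1],
    "car": [0.0, 1.0, 0.8],
}

# (word1, word2, human similarity) triples, the format of WordSim353/MEN.
pairs = [("cat", "dog", 9.0), ("cat", "car", 2.5), ("dog", "car", 3.0)]

model_scores = [cosine(emb[a], emb[b]) for a, b, _ in pairs]
human_scores = [s for _, _, s in pairs]
rho = spearman(model_scores, human_scores)
print(round(rho, 3))  # → 1.0 on this toy data (perfect rank agreement)
```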
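The Pseudocode row notes that SGNS and SVD are described only in prose. The paper's central claim is that SGNS implicitly factorizes the word-context PMI matrix shifted by log k, and its SVD baseline factorizes the shifted positive PMI matrix SPPMI(w, c) = max(PMI(w, c) − log k, 0) explicitly. A rough sketch of that SVD route on a toy co-occurrence matrix; the counts, k, and dimensionality d here are illustrative values, not the paper's:

```python
import numpy as np

k = 5  # shift parameter / number of negative samples (the paper tries 1, 5, 15)

# Toy word-context co-occurrence counts #(w, c) (hypothetical data).
counts = np.array([
    [10.0, 2.0, 0.0],
    [ 3.0, 8.0, 1.0],
    [ 0.0, 1.0, 6.0],
])

total = counts.sum()
p_w = counts.sum(axis=1, keepdims=True) / total   # P(w)
p_c = counts.sum(axis=0, keepdims=True) / total   # P(c)
p_wc = counts / total                             # P(w, c)

with np.errstate(divide="ignore"):
    pmi = np.log(p_wc / (p_w * p_c))              # PMI(w, c); -inf where #(w,c)=0

# Shifted positive PMI: max(PMI - log k, 0); the -inf entries clip to 0.
sppmi = np.maximum(pmi - np.log(k), 0.0)

# Truncated SVD; W = U_d * sqrt(Sigma_d) is a symmetric weighting of the
# kind the paper discusses for producing word vectors.
U, S, Vt = np.linalg.svd(sppmi)
d = 2
W = U[:, :d] * np.sqrt(S[:d])
print(W.shape)  # → (3, 2): one d-dimensional vector per toy word
```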
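The Experiment Setup quote (a window of 2 tokens to each side, ignoring words appearing fewer than 100 times) corresponds to a word-context pair-extraction step along these lines. This is my reconstruction, not the authors' code, and the count threshold is lowered to 1 so the toy corpus produces pairs; note that one plausible reading, used here, is to drop rare words before windowing:

```python
from collections import Counter

def extract_pairs(sentences, window=2, min_count=1):
    """Return (word, context) pairs within `window` tokens to each side,
    keeping only words that occur at least `min_count` times overall."""
    freq = Counter(tok for sent in sentences for tok in sent)
    vocab = {w for w, c in freq.items() if c >= min_count}
    pairs = []
    for sent in sentences:
        # Filter rare words first, so windows span the kept tokens.
        toks = [t for t in sent if t in vocab]
        for i, w in enumerate(toks):
            lo, hi = max(0, i - window), min(len(toks), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pairs.append((w, toks[j]))
    return pairs

corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
print(len(extract_pairs(corpus, window=2, min_count=1)))  # → 12
```

The resulting pair stream is exactly the kind of input the word2vecf fork linked above consumes, in place of raw text.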