Learning Word Representation Considering Proximity and Ambiguity
Authors: Lin Qiu, Yong Cao, Zaiqing Nie, Yong Yu, Yong Rui
AAAI 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on two text corpora created from Wikipedia documents. The big one contains 1.6 billion words from all Wikipedia documents while the small one contains 42 million words from a random sample of Wikipedia documents. We use the test set proposed in (Mikolov et al. 2013a) to measure the quality of the learned representation vectors. |
| Researcher Affiliation | Collaboration | Shanghai Jiao Tong University, {lqiu, yyu}@apex.sjtu.edu.cn Microsoft Research, {yongc, znie, yongrui}@microsoft.com |
| Pseudocode | No | The paper provides architectural diagrams and mathematical formulations but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access to source code for the described methodology. |
| Open Datasets | Yes | We train a CRF POS tagger on the Wall Street Journal data from Penn Treebank III (Marcus, Marcinkiewicz, and Santorini 1993). |
| Dataset Splits | No | The paper mentions using a test set from another paper and training on Wikipedia documents, but it does not explicitly specify training, validation, and test splits for its own corpora. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments. |
| Software Dependencies | No | The paper mentions using a CRF POS tagger and various algorithms (e.g., SGD, back-propagation), but it does not list specific software or library names with version numbers. |
| Experiment Setup | Yes | We conduct experiments on two text corpora created from Wikipedia documents. Three different vector dimensions (50, 300 and 600) are employed in the experiments. The initial values of the proximity weights are 1s instead of random values. The training of each model lasts 3 epochs. We employ Medium-Grained POS in both models. The window size is set to 31. |