Using k-Way Co-Occurrences for Learning Word Embeddings
Authors: Danushka Bollegala, Yuichi Yoshida, Ken-ichi Kawarabayashi
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results show that the derived theoretical relationship does indeed hold empirically, and despite data sparsity, for some smaller k (≤ 5) values, k-way embeddings perform comparably or better than 2-way embeddings in a range of tasks. We evaluate the word embeddings created from k-way co-occurrences on multiple benchmark datasets for semantic similarity measurement, analogy detection, relation classification, and short-text classification (§5.2). |
| Researcher Affiliation | Collaboration | Danushka Bollegala (1), Yuichi Yoshida (2), Ken-ichi Kawarabayashi (2,3). (1) University of Liverpool, Liverpool, L69 3BX, United Kingdom; (2) National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan; (3) Japan Science and Technology Agency, ERATO, Kawarabayashi Large Graph Project |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the described methodology. |
| Open Datasets | Yes | We pre-processed a January 2017 dump of English Wikipedia using a Perl script [1] and used as our corpus (contains ca. 4.6B tokens). [1] http://mattmahoney.net/dc/textdata.html |
| Dataset Splits | No | The paper mentions using training and test portions for short-text classification, but does not give the specific splits (e.g., percentages or counts) needed to reproduce all experiments, in particular the main word-embedding training. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions a Perl script, AdaGrad, and SGD, but does not provide version numbers for any software dependencies. |
| Experiment Setup | Yes | The initial learning rate is set to 0.01 in all experiments. Downweighting very frequent co-occurrences of words has been shown to be effective in prior work. This can be easily incorporated into the objective function (5) by replacing h(w_1^k) with a truncated version such as min(h(w_1^k), θ), where θ is a cut-off threshold; we set θ = 100 following prior work. (The truncation is illustrated in the sketch after this table.) |
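
The truncation quoted above is straightforward to illustrate. Below is a minimal Python sketch, not the authors' code: it counts k-way co-occurrences (sets of k distinct word types appearing together within a fixed context window) over a toy corpus and then applies h → min(h, θ) with θ = 100. The function names, the sliding-window counting scheme, the window size, and the toy corpus are all illustrative assumptions; only the truncation rule and θ = 100 come from the quoted setup.

```python
from collections import Counter
from itertools import combinations


def kway_cooccurrence_counts(tokens, k=3, window=5):
    """Count k-way co-occurrences: each set of k distinct word types that
    appears together inside a sliding context window. (This counting scheme
    is an assumption for illustration, not the paper's exact procedure.)"""
    counts = Counter()
    for i in range(len(tokens) - window + 1):
        window_types = sorted(set(tokens[i:i + window]))
        for combo in combinations(window_types, k):
            counts[combo] += 1
    return counts


def truncate(counts, theta=100):
    """Down-weight very frequent co-occurrences: h -> min(h, theta),
    with theta = 100 as in the quoted setup."""
    return {combo: min(h, theta) for combo, h in counts.items()}


if __name__ == "__main__":
    toy_corpus = "the cat sat on the mat while the dog sat on the rug".split()
    h = kway_cooccurrence_counts(toy_corpus, k=3, window=5)
    h_truncated = truncate(h)
    for combo, count in sorted(h_truncated.items())[:5]:
        print(combo, count)
```

In the paper's setup, such truncated counts replace h(w_1^k) in objective (5), and the embeddings are then learnt with SGD/AdaGrad at an initial learning rate of 0.01.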