Learning Word Representations from Relational Graphs

Authors: Danushka Bollegala, Takanori Maehara, Yuichi Yoshida, Ken-ichi Kawarabayashi

AAAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To evaluate the accuracy of the word representations learnt using the proposed method, we use the learnt word representations to solve semantic word analogy problems. Our experimental results show that it is possible to learn better word representations by using semantic relations between words.
Researcher Affiliation | Collaboration | Danushka Bollegala, Takanori Maehara, Yuichi Yoshida, Ken-ichi Kawarabayashi. The University of Liverpool, Liverpool, L69 3BX, United Kingdom; National Institute of Informatics, 2-1-2, Hitotsubashi, Chiyoda-ku, Tokyo, 101-8430, Japan; JST, ERATO, Kawarabayashi Large Graph Project; Preferred Infrastructure, Inc., Hongo 2-40-1, Bunkyo-ku, Tokyo 113-0033, Japan.
Pseudocode | Yes | Algorithm 1: Learning word representations. Input: relational graph G, dimensionality d of the word representations, maximum epochs T, initial learning rate η0. Output: word representations x(u) for words u ∈ V. (A hedged Python sketch of such a training loop is given after the table.)
Open Source Code | No | The paper does not provide a specific link or an unambiguous statement about releasing the source code for the methodology described in this paper.
Open Datasets | Yes | We use the English ukWaC corpus in our experiments. ukWaC is a 2 billion token corpus constructed from the Web, limiting the crawl to the .uk domain and medium-frequency words from the British National Corpus (BNC). The corpus is lemmatised and part-of-speech tagged using TreeTagger. Moreover, MaltParser is used to create a dependency-parsed version of the ukWaC corpus. (ukWaC: http://wacky.sslmit.unibo.it/doku.php?id=corpora)
Dataset Splits | No | The paper does not explicitly provide specific training/test/validation dataset splits, nor does it mention a validation set. It specifies training with 100 iterations.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments, only general statements about data processing tools.
Software Dependencies | No | The paper mentions "TreeTagger" and "MaltParser" as tools used for corpus processing, with URLs in footnotes. However, it does not provide specific version numbers for these tools or any other software dependencies crucial for replicating the proposed learning algorithm (e.g., Python, machine learning libraries with versions).
Experiment Setup | Yes | To evaluate the performance of the proposed method on relational graphs created using different pattern types and co-occurrence measures, we train 200-dimensional word representations (d = 200) using Algorithm 1. 100 iterations (T = 100) were sufficient to obtain convergence in all our experiments. The initial learning rate η0 is set to 0.0001 in our experiments. Alternatively, without constraining G(l) to diagonal matrices, we numerically guarantee the positive semidefiniteness of G(l) by adding a small noise term δI after each update to G(l), where I is the d × d identity matrix and δ ∈ R is a small perturbation coefficient, which we set to 0.001 in our experiments. (A sketch of this δI perturbation step also follows the table.)
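
The Pseudocode row above lists only Algorithm 1's inputs and outputs. As a rough illustration, here is a minimal Python sketch of a stochastic-gradient training loop over a relational graph with the stated inputs (graph G, dimensionality d, epochs T, initial learning rate η0). The edge format, the bilinear score x(u)ᵀ G(l) x(v), the squared-error loss, and the learning-rate decay are assumptions made for this sketch, not the paper's exact formulation.

```python
import numpy as np

def train_word_representations(edges, vocab, patterns, d=200, T=100, eta0=1e-4, seed=0):
    """Sketch of an SGD loop in the spirit of Algorithm 1.

    edges    : list of (u, v, l, weight) tuples -- assumed format, where weight
               is the co-occurrence strength of words u and v under pattern l.
    vocab    : iterable of words (the vertex set V of the relational graph).
    patterns : iterable of pattern labels l.
    Returns the word representations x(u) and the pattern matrices G(l).
    """
    rng = np.random.default_rng(seed)
    x = {u: rng.normal(scale=0.01, size=d) for u in vocab}    # word vectors x(u)
    G = {l: np.eye(d) for l in patterns}                      # per-pattern matrices G(l)

    for epoch in range(T):
        eta = eta0 / (1.0 + epoch)                 # assumed learning-rate decay
        for i in rng.permutation(len(edges)):      # visit edges in random order
            u, v, l, w = edges[i]
            score = x[u] @ G[l] @ x[v]             # assumed bilinear relational score
            err = score - w                        # residual of a squared-error loss
            grad_u = err * (G[l] @ x[v])           # d loss / d x(u)
            grad_v = err * (G[l].T @ x[u])         # d loss / d x(v)
            grad_G = err * np.outer(x[u], x[v])    # d loss / d G(l)
            x[u] = x[u] - eta * grad_u
            x[v] = x[v] - eta * grad_v
            G[l] = G[l] - eta * grad_G
    return x, G
```

With the values quoted in the Experiment Setup row, such a loop would be invoked with d=200, T=100 and eta0=0.0001.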
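
The Experiment Setup row also describes keeping each G(l) numerically positive semidefinite by adding a small term δI after every update, with δ = 0.001. A minimal sketch of that step is shown below; symmetrising G(l) before the shift is an extra assumption made here, and the helper name is hypothetical.

```python
import numpy as np

def shift_pattern_matrix(G_l, delta=1e-3):
    """Apply the perturbation delta * I to a pattern matrix G(l) after an SGD
    update (delta = 0.001 in the paper's experiments). Symmetrising first is an
    assumption of this sketch, not something stated in the paper."""
    G_l = 0.5 * (G_l + G_l.T)                   # assumed: keep G(l) symmetric
    return G_l + delta * np.eye(G_l.shape[0])   # add the delta-shift on the diagonal
```

In the loop sketched above, it would be applied right after updating G[l], e.g. G[l] = shift_pattern_matrix(G[l], delta=0.001).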