Learning Word Representations from Relational Graphs

Authors: Danushka Bollegala, Takanori Maehara, Yuichi Yoshida, Ken-ichi Kawarabayashi

AAAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To evaluate the accuracy of the word representations learnt using the proposed method, we use the learnt word representations to solve semantic word analogy problems. Our experimental results show that it is possible to learn better word representations by using semantic relations between words.
Researcher Affiliation | Collaboration | Danushka Bollegala, Takanori Maehara, Yuichi Yoshida, Ken-ichi Kawarabayashi. The University of Liverpool, Liverpool, L69 3BX, United Kingdom; National Institute of Informatics, 2-1-2, Hitotsubashi, Chiyoda-ku, Tokyo, 101-8430, Japan; JST, ERATO, Kawarabayashi Large Graph Project; Preferred Infrastructure, Inc., Hongo 2-40-1, Bunkyo-ku, Tokyo 113-0033, Japan.
Pseudocode | Yes | Algorithm 1: Learning word representations. Input: relational graph G, dimensionality d of the word representations, maximum epochs T, initial learning rate η0. Output: word representations x(u) for words u ∈ V. (A hedged Python sketch of such a training loop is given after the table.)
Open Source Code | No | The paper does not provide a specific link or an unambiguous statement about releasing the source code for the methodology described in this paper.
Open Datasets | Yes | We use the English ukWaC corpus in our experiments. ukWaC is a 2 billion token corpus constructed from the Web, limiting the crawl to the .uk domain and medium-frequency words from the British National Corpus (BNC). The corpus is lemmatised and part-of-speech tagged using TreeTagger. Moreover, MaltParser is used to create a dependency-parsed version of the ukWaC corpus. (ukWaC: http://wacky.sslmit.unibo.it/doku.php?id=corpora)
Dataset Splits | No | The paper does not explicitly provide specific training/test/validation dataset splits, nor does it mention a validation set. It specifies training with 100 iterations.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments, only general statements about data processing tools.
Software Dependencies | No | The paper mentions "TreeTagger" and "MaltParser" as tools used for corpus processing, with URLs in footnotes. However, it does not provide specific version numbers for these tools or any other software dependencies crucial for replicating the proposed learning algorithm (e.g., Python, machine learning libraries with versions).
Experiment Setup | Yes | To evaluate the performance of the proposed method on relational graphs created using different pattern types and co-occurrence measures, we train 200-dimensional word representations (d = 200) using Algorithm 1. 100 iterations (T = 100) were sufficient to obtain convergence in all our experiments. The initial learning rate η0 is set to 0.0001 in our experiments. Alternatively, without constraining G(l) to diagonal matrices, we numerically guarantee the positive semidefiniteness of G(l) by adding a small noise term δI after each update to G(l), where I is the d × d identity matrix and δ ∈ R is a small perturbation coefficient, which we set to 0.001 in our experiments. (A sketch of this δI perturbation step also follows the table.)
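
The Pseudocode row above lists only Algorithm 1's inputs and outputs. As a rough illustration, here is a minimal Python sketch of a stochastic-gradient training loop over a relational graph with the stated inputs (graph G, dimensionality d, epochs T, initial learning rate η0). The edge format, the bilinear score x(u)ᵀ G(l) x(v), the squared-error loss, and the learning-rate decay are assumptions made for this sketch, not the paper's exact formulation.

```python
import numpy as np

def train_word_representations(edges, vocab, patterns, d=200, T=100, eta0=1e-4, seed=0):
    """Sketch of an SGD loop in the spirit of Algorithm 1.

    edges    : list of (u, v, l, weight) tuples -- assumed format, where weight
               is the co-occurrence strength of words u and v under pattern l.
    vocab    : iterable of words (the vertex set V of the relational graph).
    patterns : iterable of pattern labels l.
    Returns the word representations x(u) and the pattern matrices G(l).
    """
    rng = np.random.default_rng(seed)
    x = {u: rng.normal(scale=0.01, size=d) for u in vocab}    # word vectors x(u)
    G = {l: np.eye(d) for l in patterns}                      # per-pattern matrices G(l)

    for epoch in range(T):
        eta = eta0 / (1.0 + epoch)                 # assumed learning-rate decay
        for i in rng.permutation(len(edges)):      # visit edges in random order
            u, v, l, w = edges[i]
            score = x[u] @ G[l] @ x[v]             # assumed bilinear relational score
            err = score - w                        # residual of a squared-error loss
            grad_u = err * (G[l] @ x[v])           # d loss / d x(u)
            grad_v = err * (G[l].T @ x[u])         # d loss / d x(v)
            grad_G = err * np.outer(x[u], x[v])    # d loss / d G(l)
            x[u] = x[u] - eta * grad_u
            x[v] = x[v] - eta * grad_v
            G[l] = G[l] - eta * grad_G
    return x, G
```

With the values quoted in the Experiment Setup row, such a loop would be invoked with d=200, T=100 and eta0=0.0001.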
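
The Experiment Setup row also describes keeping each G(l) numerically positive semidefinite by adding a small term δI after every update, with δ = 0.001. A minimal sketch of that step is shown below; symmetrising G(l) before the shift is an extra assumption made here, and the helper name is hypothetical.

```python
import numpy as np

def shift_pattern_matrix(G_l, delta=1e-3):
    """Apply the perturbation delta * I to a pattern matrix G(l) after an SGD
    update (delta = 0.001 in the paper's experiments). Symmetrising first is an
    assumption of this sketch, not something stated in the paper."""
    G_l = 0.5 * (G_l + G_l.T)                   # assumed: keep G(l) symmetric
    return G_l + delta * np.eye(G_l.shape[0])   # add the delta-shift on the diagonal
```

In the loop sketched above, it would be applied right after updating G[l], e.g. G[l] = shift_pattern_matrix(G[l], delta=0.001).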