Word Embedding as Maximum A Posteriori Estimation

Authors: Shoaib Jameel, Zihao Fu, Bei Shi, Wai Lam, Steven Schockaert

AAAI 2019, pp. 6562-6569 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we present a series of experiments in which we compare our model with popular and recent state-of-the-art word embedding and topic models. Experiments in this work were performed using the ICARUS computational facility from Information Services and the School of Computing Hydra Cluster at the University of Kent."
Researcher Affiliation | Collaboration | Shoaib Jameel (School of Computing, Medway Campus, University of Kent, UK); Zihao Fu (Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong); Bei Shi (Tencent AI Lab, Shenzhen, China); Wai Lam (Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong); Steven Schockaert (School of Computer Science and Informatics, Cardiff University, UK)
Pseudocode | No | The paper describes the model and its optimization using mathematical equations but does not provide pseudocode or algorithm blocks.
Open Source Code | Yes | "We share our code, pre-processing scripts and datasets online: https://bit.ly/2J5MtXj"
Open Datasets | Yes | "Corpora: We have considered the May 2018 dump of the English Wikipedia. First, we considered three analogy datasets: the Google Word Analogy dataset, the Microsoft Research Syntactic Analogies Dataset (MSR), and the BATS 3.0 dataset."
Dataset Splits | Yes | "For datasets that have pre-defined tuning and testing splits, we used these standard splits. For the other datasets, we randomly selected 20% as tuning data, and we report results on the remaining 80%." (A minimal split sketch follows the table.)
Hardware Specification | Yes | "Experiments in this work were performed using the ICARUS computational facility from Information Services and the School of Computing Hydra Cluster at the University of Kent. This experiment was performed on a 3.20 GHz machine with 25 threads."
Software Dependencies | No | The paper mentions several software tools and implementations used (e.g., Keras, Elasticsearch, VSMLib), but does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | "The number of dimensions for each model was selected from {50, 100, 300, 400}. For CBOW and SG, we chose the number of negative samples from a pool of {1, 5, 10, 15}. For GloVe, we selected the xmax value from {10, 50, 100} and α from {0.1, 0.25, 0.5, 0.75, 1}. The number of iterations for all word embedding models was fixed to 20 and the number of posterior inference iterations for all topic models was fixed to 1000. We also experimented with different learning rate parameters, namely {0.01, 0.001, 0.0001, 0.00001}." (The candidate values are collected into a grid-search sketch below.)
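For datasets without pre-defined splits, the reported protocol is a random 20%/80% tuning/test partition. The following is a minimal sketch of that procedure, not the paper's released code; the function and variable names (`split_tuning_test`, `items`) and the fixed seed are illustrative assumptions.

```python
import random

def split_tuning_test(items, tuning_fraction=0.2, seed=42):
    """Randomly hold out a tuning set and report on the rest.

    Mirrors the protocol described in the paper: 20% of the examples
    are used for hyperparameter tuning, the remaining 80% for testing.
    The seed is an assumption added here for repeatability.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * tuning_fraction)
    tuning, test = shuffled[:cut], shuffled[cut:]
    return tuning, test

# Example: split a list of analogy questions loaded elsewhere.
# tuning_set, test_set = split_tuning_test(analogy_questions)
```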
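The hyperparameter ranges quoted in the Experiment Setup row translate directly into a small grid search. The sketch below only collects those reported candidate values; the dictionary layout, the `grid` helper, and the commented `run_model` hook are hypothetical and not taken from the paper.

```python
from itertools import product

# Candidate values reported in the paper; the grid layout itself is an assumption.
shared_grid = {
    "dimensions": [50, 100, 300, 400],
    "learning_rate": [0.01, 0.001, 0.0001, 0.00001],
}
model_specific = {
    "negative_samples": [1, 5, 10, 15],        # CBOW and Skip-gram
    "glove_xmax": [10, 50, 100],               # GloVe
    "glove_alpha": [0.1, 0.25, 0.5, 0.75, 1],  # GloVe
}

WORD_EMBEDDING_ITERATIONS = 20   # fixed for all word embedding models
TOPIC_MODEL_ITERATIONS = 1000    # fixed posterior inference iterations for topic models

def grid(params):
    """Yield every combination of the given parameter value lists."""
    keys = list(params)
    for values in product(*(params[k] for k in keys)):
        yield dict(zip(keys, values))

# Example: enumerate GloVe configurations (run_model is a hypothetical hook).
# for cfg in grid({**shared_grid,
#                  "xmax": model_specific["glove_xmax"],
#                  "alpha": model_specific["glove_alpha"]}):
#     run_model("glove", iterations=WORD_EMBEDDING_ITERATIONS, **cfg)
```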