Word Embedding as Maximum A Posteriori Estimation
Authors: Shoaib Jameel, Zihao Fu, Bei Shi, Wai Lam, Steven Schockaert
AAAI 2019, pp. 6562-6569 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present a series of experiments in which we compare our model with popular and recent state-of-the-art word embedding and topic models. Experiments in this work were performed using the ICARUS computational facility from Information Services and the School of Computing Hydra Cluster at the University of Kent. |
| Researcher Affiliation | Collaboration | Shoaib Jameel,1 Zihao Fu,2 Bei Shi,3 Wai Lam,2 Steven Schockaert4 1School of Computing, Medway Campus, University of Kent, UK 2Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong 3Tencent AI Lab, Shenzhen, China 4School of Computer Science and Informatics, Cardiff University, UK |
| Pseudocode | No | The paper describes the model and its optimization using mathematical equations but does not provide pseudocode or algorithm blocks. |
| Open Source Code | Yes | We share our code, pre-processing scripts and datasets online: https://bit.ly/2J5MtXj |
| Open Datasets | Yes | Corpora: We have considered the May 2018 dump of the English Wikipedia. First, we considered three analogy datasets: the Google Word Analogy dataset, the Microsoft Research Syntactic Analogies Dataset (MSR), and the BATS 3.0 dataset. |
| Dataset Splits | Yes | For datasets that have pre-defined tuning and testing splits, we used these standard splits. For the other datasets, we randomly selected 20% as tuning data, and we report results on the remaining 80%. |
| Hardware Specification | Yes | Experiments in this work were performed using the ICARUS computational facility from Information Services and the School of Computing Hydra Cluster at the University of Kent. This experiment was performed on a 3.20 GHz machine with 25 threads. |
| Software Dependencies | No | The paper mentions several software tools and implementations used (e.g., Keras, Elasticsearch, VSMLib), but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | The number of dimensions for each model was selected from {50, 100, 300, 400}. For CBOW and SG, we chose the number of negative samples from a pool of {1, 5, 10, 15}. For GloVe, we selected the xmax value from {10, 50, 100} and α from {0.1, 0.25, 0.5, 0.75, 1}. The number of iterations for all word embedding models was fixed to 20 and the number of posterior inference iterations for all topic models was fixed to 1000. We also experimented with different learning rate parameters, namely {0.01, 0.001, 0.0001, 0.00001}. |
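
The "Dataset Splits" and "Experiment Setup" rows together describe a standard tuning protocol: a random 20%/80% tuning/test split for datasets without predefined splits, and a sweep over the listed hyperparameter ranges. The sketch below is a minimal, hypothetical illustration of that protocol. It is not the authors' released code; the function names and the `evaluate` callback are assumptions made only to show how the quoted ranges could be organized into a grid search.

```python
import itertools
import random

# Hypothetical sketch (not the authors' released code): it only illustrates the
# hyperparameter ranges and the 20%/80% tuning/test split quoted in the table above.

# Hyperparameter ranges reported in the "Experiment Setup" row.
GRID = {
    "dimensions": [50, 100, 300, 400],            # all models
    "negative_samples": [1, 5, 10, 15],           # CBOW / SG only
    "xmax": [10, 50, 100],                        # GloVe only
    "alpha": [0.1, 0.25, 0.5, 0.75, 1.0],         # GloVe only
    "learning_rate": [0.01, 0.001, 0.0001, 0.00001],
}
WORD_EMBEDDING_ITERATIONS = 20   # fixed for all word embedding models
TOPIC_MODEL_ITERATIONS = 1000    # fixed posterior-inference iterations for topic models


def tuning_test_split(examples, tuning_fraction=0.2, seed=0):
    """Randomly hold out a tuning portion; the rest is reported as test.

    Mirrors "we randomly selected 20% as tuning data, and we report results on
    the remaining 80%" for datasets without predefined splits.
    """
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * tuning_fraction)
    return shuffled[:cut], shuffled[cut:]


def grid_search(evaluate, keys):
    """Score every combination of the selected grid axes.

    `evaluate` is a hypothetical callback that trains a model with the given
    settings and returns its score on the tuning split.
    """
    best_score, best_config = float("-inf"), None
    for values in itertools.product(*(GRID[k] for k in keys)):
        config = dict(zip(keys, values))
        score = evaluate(config)
        if score > best_score:
            best_score, best_config = score, config
    return best_config, best_score
```

For example, a GloVe-style sweep would call `grid_search(evaluate, ["dimensions", "xmax", "alpha", "learning_rate"])`, whereas a CBOW/SG-style sweep would use the `negative_samples` axis instead of `xmax` and `alpha`.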