An Adaptive Hierarchical Compositional Model for Phrase Embedding
Authors: Bing Li, Xiaochun Yang, Bin Wang, Wei Wang, Wei Cui, Xianchao Zhang
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental evaluation demonstrates that our model outperforms state-of-the-art methods in both similarity tasks and analogy tasks. |
| Researcher Affiliation | Academia | School of Computer Science and Engineering, Northeastern University, China; University of New South Wales, Australia; Dongguan University of Technology, China; College of Electrical Engineering and Automation, Shandong University of Science and Technology, China; School of Software, Dalian University of Technology, China |
| Pseudocode | No | The paper describes algorithmic steps using text and equations, but does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | We used the following three training corpora: Text8 (https://cs.fit.edu/~mmahoney/compression/textdata.html), Google News (http://www.statmt.org/wmt14/training-monolingual-newscrawl/news.2012.en.shuffled.gz), Wiki (https://dumps.wikimedia.org/). |
| Dataset Splits | No | The paper describes hyperparameters and training iterations but does not provide specific training/validation/test dataset splits for its main training corpora (Text8, Google News, Wiki). It uses separate evaluation datasets. |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper mentions using Skip-Gram architecture and Negative Sampling technique but does not provide specific ancillary software details with version numbers. |
| Experiment Setup | Yes | To be specific, we set the number of negative examples to be 25, and iterations (number of epochs) to be 5. The initial learning rate of the Skip-Gram model was set to 0.05. We set the dimension of vector d = 200, unless noted otherwise. We set context window length to be 10 and sub-sampling rate 1e-5... Thus, we randomly initialized β by a Gaussian distribution N(0.5, 1). (A training sketch follows the table.) |
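
The paper reports its Skip-Gram hyperparameters but not a toolkit, so the following is a minimal sketch of that setup using gensim; the choice of gensim, the Text8 loading path, and the names `num_phrases` and `beta` are assumptions for illustration, not the authors' code.

```python
# Sketch of the reported Skip-Gram + Negative Sampling setup (assumed: gensim).
import numpy as np
import gensim.downloader as api
from gensim.models import Word2Vec

# Text8, one of the three training corpora named in the paper.
corpus = api.load("text8")

model = Word2Vec(
    sentences=corpus,
    sg=1,             # Skip-Gram architecture
    negative=25,      # 25 negative examples
    epochs=5,         # 5 training iterations
    alpha=0.05,       # initial learning rate 0.05
    vector_size=200,  # embedding dimension d = 200
    window=10,        # context window length 10
    sample=1e-5,      # sub-sampling rate 1e-5
)

# The compositional weights beta are initialized from N(0.5, 1); the paper
# does not say whether 1 is the variance or the standard deviation, so
# std = 1.0 is assumed. `num_phrases` is a hypothetical placeholder.
rng = np.random.default_rng(0)
num_phrases = 10000
beta = rng.normal(loc=0.5, scale=1.0, size=num_phrases)
```

The Google News and Wiki corpora would be substituted for `corpus` via the URLs listed above; everything else in the configuration stays as reported.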