Training and Evaluating Improved Dependency-Based Word Embeddings

Authors: Chen Li, Jianxin Li, Yangqiu Song, Ziwei Lin

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate and analyze our proposed approach using several direct and indirect tasks for word embeddings. Experimental results demonstrate that our embeddings are competitive to or better than state-of-the-art methods and significantly outperform other methods in terms of context stability.
Researcher Affiliation | Academia | Chen Li, Jianxin Li, Yangqiu Song, Ziwei Lin; Department of Computer Science & Engineering, Beihang University, Beijing 100191, China; Department of Computer Science & Engineering, Hong Kong University of Science and Technology, Hong Kong
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | Yes | Our system is publicly available at https://github.com/RingBDStack/dependency-based-w2v.
Open Datasets | No | We trained all embeddings based on a partial English Wikipedia corpus, which contains 388,900,648 tokens and 555,434 unique words. The version of the downloaded file is wikidata-20161020. The paper does not provide a direct link or formal citation for public access to the specific processed corpus used.
Dataset Splits | No | The paper mentions using standard benchmark datasets for evaluation, but it does not explicitly provide training/validation/test splits for the main Wikipedia corpus used to train the embeddings, nor does it specify validation splits for the downstream tasks' datasets where these differ from the standard test splits.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models) used for running the experiments.
Software Dependencies | No | The paper mentions the 'Word2Vec tool' and the 'Stanford neural-network dependency parser', but does not provide specific version numbers for software dependencies.
Experiment Setup | Yes | As we found that various dimensions (50, 300, 600, 1000) of word embeddings resulted in similar trends, only experimental results for 300-dimension embeddings will be reported. Meanwhile, we set the dimension of the dependency vector v(d_{w_{i,k-1}, w_{i,k}}) as 50, the initial dependency weight φ_{d_{w_{i,k-1}, w_{i,k}}} = 0.9, and initialize the word vector v(w), the positive dependency vector v(d), and other model parameters randomly. ... we dynamically adjust the context window size of target word w as follows: c_w = max(size_max - log f_w, size_min).
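The dynamic context window rule quoted above can be illustrated with a minimal sketch. The snippet below assumes size_max and size_min are the maximum and minimum window sizes and f_w is the corpus frequency of the target word; the natural-log base and the default values for size_max and size_min are assumptions for illustration, not values taken from the paper.

```python
import math

def dynamic_window(freq_w, size_max=10, size_min=2):
    """Sketch of the dynamic context window c_w = max(size_max - log f_w, size_min).

    freq_w   -- corpus frequency of the target word w
    size_max -- assumed maximum window size (not specified in the quoted setup)
    size_min -- assumed minimum window size (not specified in the quoted setup)
    The logarithm base (natural log) is also an assumption.
    """
    return max(int(round(size_max - math.log(freq_w))), size_min)

# Frequent words get narrower windows, rare words wider ones.
print(dynamic_window(1_000_000))  # high-frequency word -> window clipped to size_min
print(dynamic_window(50))         # low-frequency word  -> larger window
```

The max(..., size_min) floor keeps very frequent words from collapsing to a zero or negative window, while rare words retain a wide context.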