Joint Learning of Character and Word Embeddings

Authors: Xinxiong Chen, Lei Xu, Zhiyuan Liu, Maosong Sun, Huanbo Luan

IJCAI 2015

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We evaluate the effectiveness of CWE on word relatedness computation and analogical reasoning. The results show that CWE outperforms other baseline methods which ignore internal character information. (A hedged sketch of this evaluation protocol appears after the table.) |
| Researcher Affiliation | Academia | Department of Computer Science and Technology, State Key Lab on Intelligent Technology and Systems, National Lab for Information Science and Technology, Tsinghua University, Beijing, China; Jiangsu Collaborative Innovation Center for Language Ability, Jiangsu Normal University, Xuzhou 221009, China |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. (A hedged reconstruction of the paper's character-word composition is sketched after the table.) |
| Open Source Code | Yes | The codes and data can be accessed from https://github.com/Leonard-Xu/CWE. |
| Open Datasets | No | We select a human-annotated corpus with news articles from The People's Daily for embedding learning. The corpus has 31 million words. The paper mentions the training corpus but does not provide a direct link, DOI, or formal citation (with authors and year) for its public availability. The provided GitHub link covers only the analogical reasoning and word similarity evaluation datasets, not the 31-million-word training corpus. |
| Dataset Splits | No | The paper describes the total size of its learning corpus (31 million words) and the sizes of its evaluation datasets (wordsim-240, wordsim-296, and a manually built analogical reasoning dataset), but it does not specify explicit train/validation/test splits of the main training corpus. |
| Hardware Specification | No | The paper does not provide specific hardware details (such as exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers. |
| Experiment Setup | Yes | We set vector dimension as 200 and context window size as 5. For optimization, we use both hierarchical softmax and 10-word negative sampling. (A hedged configuration sketch appears after the table.) |
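
Since the paper itself gives no pseudocode, the following is a minimal sketch of the character-word composition that CWE describes (a word embedding averaged with the mean of its character embeddings). The vocabularies, `DIM`, and the `compose` helper are illustrative assumptions, not the authors' code.

```python
import numpy as np

DIM = 200  # matches the reported vector dimension

# Toy lookup tables; real CWE derives vocabularies from the training corpus.
rng = np.random.default_rng(0)
word_emb = {w: rng.normal(size=DIM) for w in ["智能", "科技"]}
char_emb = {c: rng.normal(size=DIM) for c in "智能科技"}

def compose(word: str) -> np.ndarray:
    """CWE-style context representation: average the word embedding with
    the mean of its character embeddings, x = (w + mean(c_k)) / 2."""
    char_mean = np.mean([char_emb[c] for c in word], axis=0)
    return 0.5 * (word_emb[word] + char_mean)

x = compose("智能")  # replaces the plain word vector in the CBOW-style objective
```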
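
The two evaluation tasks named in the paper, word relatedness and analogical reasoning, are conventionally scored as sketched below: Spearman correlation between human ratings and cosine similarities, and 3CosAdd vector-offset analogies. The toy `emb` table, placeholder ratings, and SciPy usage are assumptions for illustration, not artifacts from the paper.

```python
import numpy as np
from scipy.stats import spearmanr

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def analogy(emb: dict, a: str, b: str, c: str) -> str:
    """3CosAdd analogy: argmax_d cos(d, b - a + c), excluding a, b, c."""
    target = emb[b] - emb[a] + emb[c]
    candidates = ((w, cosine(target, v)) for w, v in emb.items() if w not in (a, b, c))
    return max(candidates, key=lambda t: t[1])[0]

# Toy 2-D space where "男 : 国王 :: 女 : 王后" holds by construction.
emb = {w: np.array(v) for w, v in {
    "国王": [1.0, 0.9], "男": [1.0, 0.0], "女": [0.0, 0.1], "王后": [0.1, 1.0]}.items()}
print(analogy(emb, "男", "国王", "女"))  # -> "王后"

# Word relatedness (e.g. wordsim-240/296) is scored as the Spearman
# correlation between human ratings and model cosine similarities.
human = [8.5, 2.1, 6.3]    # placeholder ratings
model = [0.71, 0.12, 0.55]  # placeholder cosine scores
rho, _ = spearmanr(human, model)
```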
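
As a rough illustration of the reported setup, here is a hedged stand-in using gensim's CBOW trainer with the paper's hyperparameters. CWE itself is not implemented in gensim (plain CBOW ignores characters), and the placeholder corpus and `min_count` value are assumptions.

```python
from gensim.models import Word2Vec

sentences = [["智能", "科技", "发展"]]  # placeholder corpus, not The People's Daily data

model = Word2Vec(
    sentences,
    sg=0,             # CBOW, the base model CWE extends
    vector_size=200,  # "vector dimension as 200"
    window=5,         # "context window size as 5"
    hs=1,             # hierarchical softmax
    negative=10,      # 10-word negative sampling
    min_count=1,      # assumption; the paper does not report a frequency cutoff
)
```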