Multi-Channel Reverse Dictionary Model

Authors: Lei Zhang, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Qun Liu, Maosong Sun

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate our model on English and Chinese datasets including both dictionary definitions and human-written descriptions. Experimental results show that our model achieves the state-of-the-art performance, and even outperforms the most popular commercial reverse dictionary system on the human-written description dataset. We also conduct quantitative analyses and a case study to demonstrate the effectiveness and robustness of our model."
Researcher Affiliation | Collaboration | Lei Zhang (2), Fanchao Qi (1,2), Zhiyuan Liu (1,2), Yasheng Wang (3), Qun Liu (3), Maosong Sun (1,2); 1: Department of Computer Science and Technology, Tsinghua University; 2: Institute for Artificial Intelligence, Tsinghua University, and Beijing National Research Center for Information Science and Technology; 3: Huawei Noah's Ark Lab. Contact: zhanglei9003@gmail.com, qfc17@mails.tsinghua.edu.cn, {liuzy, sms}@tsinghua.edu.cn, {wangyasheng, qun.liu}@huawei.com
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "All the code and data of this work can be obtained on https://github.com/thunlp/MultiRD."
Open Datasets | Yes | "We use the English dictionary definition dataset created by Hill et al. (2016) as the training set. It contains about 100,000 words and 900,000 word-definition pairs." (See the loading sketch after the table.)
Dataset Splits | No | The paper describes training and test sets but does not explicitly mention a validation set split or its details. (See the split sketch after the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper mentions using 'word2vec' for word embeddings but does not specify its version or any other software dependencies with version numbers. (See the embedding sketch after the table.)
Experiment Setup | Yes | "For our model, the dimension of uni-directional hidden states is 300/2, the weights of different channels are equally set to 1, and the dropout rate is 0.5. For training, we adopt Adam as the optimizer with initial learning rate 0.001, and the batch size is 128." (See the training sketch after the table.)
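
To ground the "Open Datasets" entry, here is a minimal loading sketch in Python. The file name data_train.json and the "word"/"definitions" field names are assumptions for illustration; the actual schema is defined in the MultiRD repository linked above.

```python
import json

# Minimal sketch: load word-definition pairs from a JSON file.
# NOTE: the file name and the "word"/"definitions" field names are
# assumptions; check https://github.com/thunlp/MultiRD for the real schema.
def load_pairs(path):
    pairs = []
    with open(path, encoding="utf-8") as f:
        for entry in json.load(f):
            for definition in entry["definitions"]:
                pairs.append((entry["word"], definition))
    return pairs

pairs = load_pairs("data_train.json")
print(f"Loaded {len(pairs)} word-definition pairs")  # ~900,000 per the paper
```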
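
Because the paper documents no validation split, anyone reproducing it must choose one. The sketch below holds out a random fraction of training pairs; the 5% ratio and the fixed seed are assumptions, not the authors' settings.

```python
import random

def split_pairs(pairs, valid_fraction=0.05, seed=42):
    # Hold out a random validation set. The fraction and seed are
    # assumptions; the paper does not specify a validation split.
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - valid_fraction))
    return shuffled[:cut], shuffled[cut:]

# Usage, e.g. with the pairs from the loading sketch above:
# train_pairs, valid_pairs = split_pairs(pairs)
```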
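
For the "Software Dependencies" entry, one plausible way to obtain 300-dimensional word2vec embeddings is gensim with the pre-trained Google News vectors; both the library and the vector file are assumptions, since the paper names neither.

```python
import numpy as np
import torch
from gensim.models import KeyedVectors

# Assumption: the paper only says "word2vec", so the Google News 300-d
# vectors and the gensim loader are illustrative choices, not the
# authors' documented dependency.
vectors = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

def embedding_matrix(vocab, dim=300):
    # Build a |vocab| x 300 matrix; words missing from word2vec stay zero.
    mat = np.zeros((len(vocab), dim), dtype=np.float32)
    for i, word in enumerate(vocab):
        if word in vectors:
            mat[i] = vectors[word]
    return torch.from_numpy(mat)
```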
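
The "Experiment Setup" hyperparameters map onto the following minimal PyTorch sketch. The BiLSTM encoder is only a stand-in for the paper's full multi-channel architecture; what the sketch pins down are the quoted settings: 300/2 hidden units per direction, dropout 0.5, Adam with learning rate 0.001, and batch size 128.

```python
import torch
import torch.nn as nn

# Minimal sketch of the reported training settings. The single BiLSTM
# encoder below is a stand-in for the paper's multi-channel model, not a
# reproduction of it; the max-pooling step is likewise an assumption.
class DefinitionEncoder(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=150):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.dropout = nn.Dropout(0.5)  # dropout rate from the paper
        self.encoder = nn.LSTM(embed_dim, hidden_dim,
                               bidirectional=True, batch_first=True)

    def forward(self, token_ids):
        x = self.dropout(self.embed(token_ids))
        states, _ = self.encoder(x)      # (batch, seq_len, 2 * hidden_dim)
        return states.max(dim=1).values  # pool to a 300-d definition vector

model = DefinitionEncoder(vocab_size=50000)  # vocab size is an assumption
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # lr from paper
batch_size = 128  # from the paper
```

With 150 units per direction, the concatenated bidirectional states are 300-dimensional, matching the word2vec embedding size, which is presumably why the hidden dimension is reported as 300/2.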