Neural Bag-of-Ngrams

Authors: Bofang Li, Tao Liu, Zhe Zhao, Puwei Wang, Xiaoyong Du

AAAI 2017

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We perform qualitative evaluation on the IMDB dataset (Table 2), and quantitative evaluation on the text classification task (7 datasets) and the semantic relatedness task (2 datasets with 7 categories). |
| Researcher Affiliation | Academia | Bofang Li, Tao Liu, Zhe Zhao, Puwei Wang, Xiaoyong Du; School of Information, Renmin University of China, Beijing, China; Key Laboratory of Data Engineering and Knowledge Engineering, MOE, Beijing, China; {libofang, tliu, helloworld, wangpuwei, duyong}@ruc.edu.cn |
| Pseudocode | No | The paper describes its methods textually and mathematically but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code of Neural-BoN is published at https://github.com/libofang/Neural-BoN. |
| Open Datasets | Yes | For the text classification task, hyper-parameters are tuned on 20% of the training data from the IMDB dataset (Maas et al. 2011). For the semantic relatedness task, hyper-parameters are tuned on the development data from the SICK dataset (Marelli et al. 2014). As in previous research, the Toronto Books Corpus is used as training data. |
| Dataset Splits | Yes | For the text classification task, hyper-parameters are tuned on 20% of the training data from the IMDB dataset (Maas et al. 2011). For the semantic relatedness task, hyper-parameters are tuned on the development data from the SICK dataset (Marelli et al. 2014). (A sketch of this split follows the table.) |
| Hardware Specification | Yes | Table 3: Approximate training time of models for a single epoch on one million words. CPU: Intel Xeon E5-2670 (32-core). GPU: NVIDIA Tesla K40. |
| Software Dependencies | No | The paper mentions techniques such as negative sampling, stochastic gradient descent, and backpropagation, but does not name any specific software or library versions used for the implementation. |
| Experiment Setup | Yes | Optimal hyper-parameters are actually identical: the vector dimension is 500, the learning rate is fixed to 0.25, the negative sampling size is 5, and models are trained for 10 iterations. (These values are mirrored in the training sketch below.) |
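
The 20% tuning split reported in the Dataset Splits row is straightforward to reproduce. Below is a minimal sketch, assuming scikit-learn (the paper names no tooling) and placeholder data standing in for the 25,000 IMDB training reviews; the seed and stratification are illustrative assumptions, not details from the paper.

```python
from sklearn.model_selection import train_test_split

# Placeholder corpus; in the paper this would be the 25k IMDB training
# reviews (Maas et al. 2011) with their binary sentiment labels.
texts = [f"review {i}" for i in range(100)]
labels = [i % 2 for i in range(100)]

# Hold out 20% of the training data for hyper-parameter tuning, as in the
# Dataset Splits row above. The stratified, seeded split is our assumption;
# the paper does not specify either detail.
train_texts, tune_texts, train_labels, tune_labels = train_test_split(
    texts, labels, test_size=0.20, random_state=42, stratify=labels)
```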
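
The Experiment Setup row pins down the hyper-parameters but not the update rule. The NumPy sketch below shows a generic negative-sampling SGD step using the reported values (dimension 500, learning rate 0.25, 5 negatives, 10 iterations); the vocabulary size, initialization, and uniform negative sampler are assumptions, so this illustrates negative sampling in general rather than the authors' Neural-BoN implementation.

```python
import numpy as np

# Hyper-parameters as reported in the paper.
DIM, LR, NEG, EPOCHS = 500, 0.25, 5, 10

rng = np.random.default_rng(0)
V = 10_000                                  # hypothetical vocabulary size
W_in = rng.normal(0.0, 0.01, (V, DIM))      # input (word/n-gram) vectors
W_out = np.zeros((V, DIM))                  # output (context) vectors

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgd_step(center, context):
    """One negative-sampling SGD update for a (center, context) pair."""
    # One observed context plus NEG uniformly drawn negatives; a
    # word2vec-style system would typically use a unigram-based sampler.
    targets = np.concatenate(([context], rng.integers(0, V, NEG)))
    labels = np.array([1.0] + [0.0] * NEG)
    v = W_in[center]                          # (DIM,)
    u = W_out[targets]                        # (NEG + 1, DIM)
    grad = sigmoid(u @ v) - labels            # logistic-loss gradient
    W_in[center] -= LR * (grad @ u)           # update input vector
    W_out[targets] -= LR * np.outer(grad, v)  # update output vectors
```

A full training run would loop `sgd_step` over every (center, context) pair in the corpus for `EPOCHS` passes, matching the 10 iterations reported above.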