Convolutional Neural Tensor Network Architecture for Community-Based Question Answering

Authors: Xipeng Qiu, Xuanjing Huang

IJCAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experimental results show that our method outperforms the other methods on two matching tasks. We perform extensive empirical studies on two matching tasks, and demonstrate that CNTN is more effective than the other models.
Researcher Affiliation | Academia | Xipeng Qiu and Xuanjing Huang, Shanghai Key Laboratory of Data Science, Fudan University; School of Computer Science, Fudan University; 825 Zhangheng Road, Shanghai, China; xpqiu@fudan.edu.cn, xjhuang@fudan.edu.cn
Pseudocode | No | The paper describes the model architecture and training process in text and equations, but does not include any formally labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | For initialization of parameters, we use word2vec [Mikolov et al., 2013] to train word embeddings on Wikipedia corpus for English and Chinese respectively. (This embedding step is sketched after the table.)
Dataset Splits | Yes | We select 10,000 original positive pairs as development set and another 10,000 original positive pairs as test set. The rest QA pairs are used for training. We randomly select 5,000 QA pairs as the development set, another 5,000 pairs as the test set. (A split helper is sketched below.)
Hardware Specification | No | The paper does not specify any details about the hardware (e.g., GPU or CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions software components such as word2vec, AdaGrad, SGD, MLP, and CNN, but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | The other hyperparameters of our model are set as in Table 2, which are chosen on development datasets in consideration of both accuracy and efficiency. Table 2 (major hyperparameters of the CNTN model): word embedding size n_w = 25; initial learning rate ρ = 0.1; regularization λ = 10^-4; CNN depth d = 3; filter width m = 3; sentence embedding size n_s = 50. To minimize the objective, we use stochastic gradient descent (SGD) with the diagonal variant of AdaGrad [Duchi et al., 2011]. (An update-step sketch appears below.)
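
The Open Datasets row quotes the paper's only detail about embedding initialization: word2vec embeddings trained on Wikipedia, with n_w = 25 from Table 2. Below is a minimal sketch of that step, assuming the gensim 4.x Word2Vec API, a skip-gram objective, and a pre-tokenized corpus; none of these tooling choices is specified in the paper.

```python
from gensim.models import Word2Vec

# Stand-in for a tokenized Wikipedia dump: an iterable of token lists.
# The real corpus (English or Chinese Wikipedia) is not distributed with
# the paper, so two toy sentences are used here.
wiki_sentences = [
    ["how", "do", "i", "reset", "my", "password"],
    ["click", "the", "forgot", "password", "link"],
]

model = Word2Vec(
    sentences=wiki_sentences,
    vector_size=25,  # n_w = 25, matching Table 2
    window=5,        # assumed context window; not reported in the paper
    min_count=1,
    sg=1,            # skip-gram; the paper does not say which word2vec variant was used
)
model.save("word2vec_wiki_25d.model")  # hypothetical output path
```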
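For the Dataset Splits row, a minimal helper consistent with the quoted numbers might look like the following; the function name, the fixed seed, and the list-of-tuples representation of QA pairs are all assumptions for illustration.

```python
import random

def split_qa_pairs(pairs, dev_size, test_size, seed=42):
    """Shuffle QA pairs, carve off development and test sets of the
    quoted sizes, and use everything left over for training."""
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    dev = shuffled[:dev_size]
    test = shuffled[dev_size:dev_size + test_size]
    train = shuffled[dev_size + test_size:]
    return train, dev, test

# Toy corpus; the quoted splits are 10,000 dev / 10,000 test (first task)
# or 5,000 dev / 5,000 test (second task).
all_pairs = [("question %d" % i, "answer %d" % i) for i in range(30_000)]
train, dev, test = split_qa_pairs(all_pairs, dev_size=10_000, test_size=10_000)
```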
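Finally, the Experiment Setup row names the optimizer (SGD with the diagonal variant of AdaGrad [Duchi et al., 2011]), the initial learning rate ρ = 0.1, and the regularization λ = 10^-4. Here is a minimal NumPy sketch of a single per-parameter update under those settings; the ε stability constant and the placement of the L2 term in the gradient are standard choices assumed here, not details from the paper.

```python
import numpy as np

def adagrad_step(param, grad, cache, rho=0.1, lam=1e-4, eps=1e-8):
    """One diagonal-AdaGrad update (Duchi et al., 2011): accumulate
    squared gradients per coordinate and divide the learning rate by
    their square root. lam adds the paper's L2 regularization."""
    grad = grad + lam * param             # gradient of (lam/2) * ||param||^2
    cache += grad ** 2                    # per-coordinate squared-gradient history
    param -= rho * grad / (np.sqrt(cache) + eps)
    return param, cache

# Toy usage: one update on a random parameter vector.
p = np.random.randn(50)                   # e.g. a sentence-embedding row, n_s = 50
g = np.random.randn(50)                   # stand-in gradient from backprop
c = np.zeros_like(p)                      # AdaGrad cache starts at zero
p, c = adagrad_step(p, g, c)
```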