Text Matching as Image Recognition
Authors: Liang Pang, Yanyan Lan, Jiafeng Guo, Jun Xu, Shengxian Wan, Xueqi Cheng
AAAI 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate its superiority against the baselines. In this section, we conduct experiments on two tasks, i.e. paraphrase identification and paper citation matching, to demonstrate the superiority of MatchPyramid against baselines. |
| Researcher Affiliation | Academia | CAS Key Laboratory of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China {pangliang,wanshengxian}@software.ict.ac.cn, {lanyanyan,guojiafeng,junxu,cxq}@ict.ac.cn |
| Pseudocode | No | The paper describes the convolutional neural network operations and scoring function using mathematical equations (e.g., Eq. 6-10) but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper refers to publicly released models for baselines (DSSM/CDSSM) but does not provide any statement or link for the source code of their proposed MatchPyramid model. |
| Open Datasets | Yes | Here we use the benchmark MSRP dataset (Dolan and Brockett 2005), which contains 4076 instances for training and 1725 for testing. The [paper citation matching] dataset is collected from a commercial academic website. It contains 838,908 instances (text pairs) in total, where there are 279,636 positive (matched) instances and 559,272 negative (mismatched) instances. |
| Dataset Splits | Yes | Here we use the benchmark MSRP dataset (Dolan and Brockett 2005), which contains 4076 instances for training and 1725 for testing. We split the whole dataset into three parts, 599,196 instances for training, 119,829 for validation and 119,883 for testing. |
| Hardware Specification | No | The paper mentions that the optimization 'can be easily parallelized on single machine with multi-cores' but does not provide any specific details about the hardware (e.g., CPU, GPU models, memory). |
| Software Dependencies | No | The paper mentions techniques and libraries like 'Word2Vec', 'Adagrad', and 'ReLU' and cites their original papers, but does not specify any software names with version numbers for implementation details (e.g., Python 3.x, TensorFlow 2.x, PyTorch 1.x). |
| Experiment Setup | Yes | All these models use two convolutional layers, two max-pooling layers (one of which is a dynamic pooling layer for variable length) and two full connection layers. The number of feature maps is 8 and 16 for the first and second convolutional layer, respectively. While the kernel size is set to be 5×5 and 3×3, respectively. We apply stochastic gradient descent method Adagrad (Duchi, Hazan, and Singer 2011) for the optimization of models. It performs better when we use the mini-batch strategy (32–50 in size). |
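
The Experiment Setup row above quotes concrete layer sizes. Below is a minimal sketch of such a network in PyTorch; the paper does not name its framework, and the class name `MatchPyramidSketch`, the pooled grid size, the hidden width, and the padding choices are illustrative assumptions rather than the authors' implementation.

```python
# Hypothetical sketch of the architecture quoted above (not the authors' code):
# two conv layers with 8 and 16 feature maps (5x5 and 3x3 kernels), two
# pooling layers (one dynamic), and two fully connected layers.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MatchPyramidSketch(nn.Module):
    def __init__(self, pooled_size=(10, 10), hidden_dim=128, num_classes=2):
        super().__init__()
        # Conv layer 1: 1 input channel (the word-word matching matrix), 8 feature maps, 5x5 kernel.
        self.conv1 = nn.Conv2d(1, 8, kernel_size=5, padding=2)
        # Conv layer 2: 16 feature maps, 3x3 kernel.
        self.conv2 = nn.Conv2d(8, 16, kernel_size=3, padding=1)
        # "Dynamic pooling": adaptive max-pooling maps variable-length texts to a fixed grid.
        self.dyn_pool = nn.AdaptiveMaxPool2d(pooled_size)
        flat = 16 * (pooled_size[0] // 2) * (pooled_size[1] // 2)
        self.fc1 = nn.Linear(flat, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, num_classes)

    def forward(self, emb_a, emb_b):
        # emb_a: (batch, len_a, dim), emb_b: (batch, len_b, dim) word embeddings (e.g. Word2Vec).
        # Matching matrix: dot product between every word pair of the two texts.
        match = torch.bmm(emb_a, emb_b.transpose(1, 2)).unsqueeze(1)  # (batch, 1, len_a, len_b)
        x = F.relu(self.conv1(match))
        x = self.dyn_pool(x)          # dynamic pooling to a fixed-size grid
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2)        # second (ordinary) max-pooling layer
        x = torch.flatten(x, 1)
        x = F.relu(self.fc1(x))
        return self.fc2(x)            # matching score / class logits
```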
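
The optimization details in the same row (Adagrad with mini-batches of 32–50) could be exercised with a training step like the one below, building on the sketch above; the learning rate, loss function, and toy tensor shapes are assumptions, not values from the paper.

```python
# Hypothetical training step: Adagrad with a mini-batch of 32 (within the 32-50
# range quoted above). Loss, learning rate, and data shapes are assumed.
model = MatchPyramidSketch()
optimizer = torch.optim.Adagrad(model.parameters(), lr=0.01)  # lr is an assumption
criterion = nn.CrossEntropyLoss()

# Toy batch: 32 text pairs of lengths 20 and 25, with 50-dim embeddings (all assumed).
emb_a = torch.randn(32, 20, 50)
emb_b = torch.randn(32, 25, 50)
labels = torch.randint(0, 2, (32,))

optimizer.zero_grad()
loss = criterion(model(emb_a, emb_b), labels)
loss.backward()
optimizer.step()
```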