Attention-Fused Deep Matching Network for Natural Language Inference
Authors: Chaoqun Duan, Lei Cui, Xinchi Chen, Furu Wei, Conghui Zhu, Tiejun Zhao
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiment results show that AF-DMN achieves state-of-the-art performance and outperforms strong baselines on the Stanford Natural Language Inference (SNLI), Multi-Genre Natural Language Inference (MultiNLI), and Quora duplicate questions datasets. |
| Researcher Affiliation | Collaboration | Harbin Institute of Technology, Harbin, China; Microsoft Research Asia, Beijing, China; School of Computer Science, Fudan University, Shanghai, China |
| Pseudocode | No | The paper describes the model architecture and mathematical formulations but does not contain a dedicated 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | The paper mentions that the code for ESIM is available at 'https://github.com/lukecq1231/nli', but there is no explicit statement or link providing the open-source code for the AF-DMN methodology described in this paper. |
| Open Datasets | Yes | We evaluate our model on three datasets: the Stanford Natural Language Inference (SNLI) corpus, the Multi-Genre NLI Corpus (MultiNLI), and Quora duplicate questions (Quora, https://data.quora.com/First-Quora-Dataset-Release-QuestionPairs). SNLI: the SNLI corpus [Bowman et al., 2015] contains 570,152 sentence pairs. MultiNLI: the MultiNLI corpus [Williams et al., 2017] is a new dataset for NLI, which contains 433k sentence pairs. Quora: the Quora corpus contains over 400,000 question pairs. |
| Dataset Splits | Yes | Train / Dev / Test sizes, with average sentence lengths (Avg.L, premise/hypothesis) and vocabulary size: SNLI 549K / 9.8K / 9.8K, Avg.L 14/8, Vocab 36K; MultiNLI (matched) 392K / 9.8K / 9.8K, Avg.L 22/11, Vocab 85K; MultiNLI (mismatched) – / 9.8K / 9.8K, Avg.L 22/11, Vocab 85K; Quora 384K / 10K / 10K, Avg.L 12/12, Vocab 107K |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions using Adam as an optimizer and GloVe vectors for initialization but does not provide specific version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | In our model, word embeddings and all hidden states of LSTMs and MLPs are 300 dimensions. For the SNLI dataset, there are 3 computational blocks in the deep matching layer, while there are 2 for the MultiNLI and Quora datasets. We employ Adam [Kingma and Ba, 2014] for training, with its hyper-parameters β1 and β2 set to 0.9 and 0.999, respectively. The initial learning rate of Adam is set to 0.0002. The learning rate is halved when the accuracy on the development set drops. We also employ a dropout strategy [Srivastava et al., 2014] on word embeddings and all MLPs to avoid over-fitting. The dropout rate is set to 0.2. The batch size is set to 64. We set the maximum length of sentences to 200. |
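
For concreteness, the sketch below wires up the reported training configuration. It is an assumption-laden illustration, not the authors' code: PyTorch is assumed (the paper does not name its framework), the model is a placeholder dropout-plus-linear layer rather than AF-DMN, and `dev_accuracy` with its synthetic data is a hypothetical stand-in for real evaluation on the development set.

```python
# Minimal sketch of the reported training setup: Adam (lr=0.0002,
# beta1=0.9, beta2=0.999), dropout 0.2, batch size 64, and halving
# the learning rate when dev accuracy drops. Model and data are
# placeholders, not the AF-DMN architecture.
import torch
import torch.nn as nn

EMBED_DIM = 300   # word embeddings and all LSTM/MLP hidden states
DROPOUT = 0.2     # dropout on word embeddings and all MLPs
BATCH_SIZE = 64
NUM_CLASSES = 3   # entailment / contradiction / neutral

# Placeholder model standing in for AF-DMN.
model = nn.Sequential(
    nn.Dropout(DROPOUT),
    nn.Linear(EMBED_DIM, NUM_CLASSES),
)

optimizer = torch.optim.Adam(model.parameters(), lr=2e-4, betas=(0.9, 0.999))
loss_fn = nn.CrossEntropyLoss()

def dev_accuracy(model: nn.Module) -> float:
    """Stub evaluation on synthetic data; replace with the real dev set."""
    with torch.no_grad():
        x = torch.randn(BATCH_SIZE, EMBED_DIM)
        y = torch.randint(0, NUM_CLASSES, (BATCH_SIZE,))
        return (model(x).argmax(dim=1) == y).float().mean().item()

prev_dev_acc = 0.0
for epoch in range(5):
    # One synthetic training step per "epoch", for illustration only.
    x = torch.randn(BATCH_SIZE, EMBED_DIM)
    y = torch.randint(0, NUM_CLASSES, (BATCH_SIZE,))
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    optimizer.step()

    acc = dev_accuracy(model)
    if acc < prev_dev_acc:
        # Halve the learning rate when dev accuracy drops, per the paper.
        for group in optimizer.param_groups:
            group["lr"] /= 2.0
    prev_dev_acc = acc
```

The learning-rate schedule is implemented by hand here because the paper describes a dev-accuracy trigger rather than a fixed decay; `torch.optim.lr_scheduler.ReduceLROnPlateau` with `mode="max"` and `factor=0.5` would be an idiomatic built-in alternative.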