Hermitian Co-Attention Networks for Text Matching in Asymmetrical Domains
Authors: Yi Tay, Anh Tuan Luu, Siu Cheung Hui
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on five text matching benchmark datasets demonstrate the effectiveness of our approach. |
| Researcher Affiliation | Collaboration | Yi Tay (Nanyang Technological University, Singapore), Anh Tuan Luu (Institute for Infocomm Research, A*STAR, Singapore), Siu Cheung Hui (Nanyang Technological University, Singapore) |
| Pseudocode | No | The paper provides a diagram of the model architecture (Figure 1) and describes the various components and their mathematical operations in Section 3. However, it does not include a dedicated pseudocode block or algorithm section. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code for the methodology, nor does it include a link to a code repository. |
| Open Datasets | Yes | We use the SciTail dataset [Khot et al., 2018]... We evaluate our proposed approach on two popular and widely adopted benchmarks for retrieval-based QA, i.e., WikiQA [Yang et al., 2015] and TrecQA [Wang et al., 2007]... We utilize a customer support dataset obtained from Kaggle (https://www.kaggle.com/soaxelbrooke/customer-support-on-twitter)... We utilize the large and well-known Ubuntu Dialogue Corpus (UDC) [Lowe et al., 2015]. |
| Dataset Splits | Yes | SciTail: There are 23K, 1.3K and 2K pairs for training, development and testing respectively. WikiQA: WikiQA comprises 5.9K training pairs and 1.1K/1.4K development/testing pairs. TrecQA: TrecQA comprises 53K pairs for training and 1.1K/1.5K pairs for development and testing. Twitter: The dataset is split into an 8:1:1 train-dev-test split. Ubuntu Dialogue Corpus: The training set comprises one million message-response pairs at a 1:1 positive-negative ratio. The development and testing sets have a 9:1 ratio. |
| Hardware Specification | No | The paper does not specify any details about the hardware used to run the experiments, such as GPU models, CPU types, or cloud computing instances. |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer' and 'GloVe 300D embeddings' but does not specify version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | We train all models with the Adam optimizer with a learning rate of 3 × 10⁻⁴. The L2 regularization is set to 10⁻⁶ and a dropout of d = 0.8 is applied to all layers (except the embedding layer). We initialize word embeddings with GloVe 300D and keep the embeddings fixed during training. The batch size is set to 64. For WikiQA and TrecQA, we use the Adadelta optimizer with a learning rate of 0.1 for WikiQA and 0.2 for TrecQA. The learning rate is decayed at a rate of 0.96 every 10000 steps. The batch size is 100. The Adam optimizer with a learning rate of 3 × 10⁻⁴ is used. ... Sequence lengths are padded to a maximum of 50 tokens. The batch size is 256 and L2 regularization is 10⁻⁶. A minimal configuration sketch of these settings appears below the table. |
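
To make the reported hyperparameters concrete, here is a minimal, hypothetical sketch of the optimizer and embedding setup in PyTorch. The paper does not name its framework, so the library choice, the mapping of L2 regularization onto Adam's `weight_decay`, the interpretation of d = 0.8, and the helper names (`frozen_glove_embedding`, `build_optimizer`) are all assumptions rather than the authors' implementation.

```python
# Minimal sketch of the reported SciTail training configuration (assumptions noted inline).
import torch
import torch.nn as nn

EMBED_DIM = 300        # GloVe 300D embeddings, kept fixed during training
BATCH_SIZE = 64        # batch size reported for SciTail
LEARNING_RATE = 3e-4   # Adam learning rate, as reported
L2_REG = 1e-6          # reported L2 regularization; applied here as weight decay (an assumption)
DROPOUT_D = 0.8        # "d = 0.8"; the paper does not state whether this is a keep or drop probability


def frozen_glove_embedding(glove_weights: torch.Tensor) -> nn.Embedding:
    """Embedding layer initialized from pretrained GloVe vectors and frozen, per the paper."""
    return nn.Embedding.from_pretrained(glove_weights, freeze=True)


def build_optimizer(model: nn.Module) -> torch.optim.Optimizer:
    """Adam with the reported learning rate; L2 regularization applied as weight decay."""
    return torch.optim.Adam(model.parameters(), lr=LEARNING_RATE, weight_decay=L2_REG)
```

For the WikiQA/TrecQA setting described in the same row, an analogous sketch would swap in `torch.optim.Adadelta` with the reported learning rates (0.1 and 0.2) and a step-wise decay such as `torch.optim.lr_scheduler.StepLR(optimizer, step_size=10000, gamma=0.96)`; again, the scheduler choice is an assumption consistent with the quoted decay schedule, not a detail given in the paper.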