Deconvolutional Latent-Variable Model for Text Sequence Matching
Authors: Dinghan Shen, Yizhe Zhang, Ricardo Henao, Qinliang Su, Lawrence Carin
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments |
| Researcher Affiliation | Academia | Dinghan Shen, Yizhe Zhang, Ricardo Henao, Qinliang Su, Lawrence Carin; Department of Electrical & Computer Engineering, Duke University, Durham, NC 27708; {dinghan.shen, yizhe.zhang, ricardo.henao, qinliang.su, lcarin}@duke.edu |
| Pseudocode | No | The paper describes the model architecture and processes verbally and with diagrams, but no structured pseudocode or algorithm blocks are provided. |
| Open Source Code | No | The paper does not include an explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | Table 1 (summary of text sequence matching datasets): Quora: 384,348 train / 10,000 test / 2 classes / 10k vocabulary; SNLI: 549,367 train / 9,824 test / 3 classes / 20k vocabulary. Further, we apply our models to two standard text sequence matching tasks: Recognizing Textual Entailment (RTE) and paraphrase identification, in a semi-supervised setting. The summary statistics of both datasets are presented in Table 1. |
| Dataset Splits | Yes | Dropout (Srivastava et al. 2014) is employed on both word embedding and latent variable layers, with rates selected from {0.3, 0.5, 0.8} on the validation set. We set the mini-batch size to 32. In semi-supervised sequence matching experiments, the L2 norm of the weight vectors is employed as a regularization term in the loss function, and the coefficient of the L2 loss is treated as a hyperparameter tuned on the validation set. (See the tuning sketch below the table.) |
| Hardware Specification | Yes | All experiments are implemented in Tensorflow (Abadi et al. 2016), using one NVIDIA GeForce GTX TITAN X GPU with 12GB memory. |
| Software Dependencies | No | All experiments are implemented in Tensorflow (Abadi et al. 2016)... No specific version number for Tensorflow or other software dependencies is provided. |
| Experiment Setup | Yes | We use 3-layer convolutional neural networks for the inference/encoder network... for all layers we set the filter window size (W) as 5, with a stride of 2. The feature maps (K) are set as 300, 600, 500, for layers 1 through 3, respectively... The model is trained using Adam (Kingma and Ba 2014) with a learning rate of 3 × 10⁻⁴ for all parameters. Dropout (Srivastava et al. 2014) is employed on both word embedding and latent variable layers, with rates selected from {0.3, 0.5, 0.8} on the validation set. We set the mini-batch size to 32. (See the encoder sketch below the table.) |
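
To make the configuration quoted in the Experiment Setup row concrete, here is a minimal sketch of the 3-layer convolutional encoder, written with the tf.keras API. Since the paper releases no code, the function and variable names (`build_encoder`, `max_len`, `latent_dim`), the ReLU activations, and the pooling step are assumptions for illustration only; the filter window size, stride, feature-map counts, dropout placement, and Adam learning rate follow the quoted text.

```python
# Hedged sketch of the paper's 3-layer CNN encoder; names and activations are assumed.
import tensorflow as tf


def build_encoder(vocab_size=20000, embed_dim=300, max_len=60, latent_dim=500):
    """3-layer Conv1D encoder: window size W=5, stride 2, feature maps K=300/600/500."""
    tokens = tf.keras.Input(shape=(max_len,), dtype="int32")
    x = tf.keras.layers.Embedding(vocab_size, embed_dim)(tokens)
    # Dropout on word embeddings; rate selected from {0.3, 0.5, 0.8} per the paper.
    x = tf.keras.layers.Dropout(0.5)(x)
    for filters in (300, 600, 500):  # feature maps K for layers 1 through 3
        x = tf.keras.layers.Conv1D(filters, kernel_size=5, strides=2,
                                   padding="same", activation="relu")(x)
    # Assumed pooling to a fixed-size latent code (the paper does not quote this step).
    z = tf.keras.layers.GlobalMaxPooling1D()(x)
    # Dropout on the latent-variable layer, also from {0.3, 0.5, 0.8}.
    z = tf.keras.layers.Dropout(0.5)(z)
    return tf.keras.Model(tokens, z)


encoder = build_encoder()
# Learning rate of 3e-4 for all parameters, as quoted; mini-batch size is 32.
optimizer = tf.keras.optimizers.Adam(learning_rate=3e-4)
```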
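
Similarly, the hyperparameter selection described in the Dataset Splits row (dropout rate from {0.3, 0.5, 0.8} and a tuned L2 coefficient, both chosen on the validation set) can be sketched as a simple grid search. `train_and_evaluate` is a hypothetical helper standing in for a full training run, and the L2 coefficient grid is an assumption, since the paper does not state the candidate values.

```python
# Hedged sketch of validation-set hyperparameter tuning; the L2 grid is assumed.
import tensorflow as tf


def train_and_evaluate(dropout_rate: float, l2_coef: float) -> float:
    """Hypothetical helper: train the matching model, return validation accuracy."""
    # L2 norm of the weight vectors, added to the loss as a regularization term.
    reg = tf.keras.regularizers.l2(l2_coef)
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(300, activation="relu", kernel_regularizer=reg),
        tf.keras.layers.Dropout(dropout_rate),
        tf.keras.layers.Dense(3),  # 3 classes for SNLI (2 for Quora)
    ])
    # ... compile with Adam (lr 3e-4), train with batch size 32,
    # and score on the held-out validation split.
    return 0.0  # placeholder


# Dropout rates are quoted from the paper; the L2 coefficient grid is an assumption.
candidates = [(r, c) for r in (0.3, 0.5, 0.8) for c in (1e-5, 1e-4, 1e-3)]
best_rate, best_coef = max(candidates, key=lambda hp: train_and_evaluate(*hp))
```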