Posterior Attention Models for Sequence to Sequence Learning

Authors: Shiv Shankar, Sunita Sarawagi

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically on five translation and two morphological inflection tasks the proposed posterior attention models yield better BLEU score and alignment accuracy than existing attention models.
Researcher Affiliation | Collaboration | Shiv Shankar, University of Massachusetts Amherst, sshankar@umass.edu; Sunita Sarawagi, IIT Bombay, sunita@iitb.ac.in ... Acknowledgements: We thank NVIDIA Corporation and Flipkart for supporting this research.
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about releasing source code for its methodology, nor a link to a code repository. It mentions using the 'author's code' for a baseline model but not for its own.
Open Datasets | Yes | We experiment on five language pairs from three datasets: IWSLT15 English-Vietnamese, IWSLT14 German-English (Cettolo et al., 2015); and WAT17 Japanese-English (Nakazawa et al., 2016). The paper also uses the RWTH German-English dataset, which provides alignment information manually tagged by experts.
Dataset Splits | No | The paper mentions training models and evaluating at different beam sizes but does not specify the training, validation, or test dataset splits (e.g., percentages or counts) needed for reproduction. It uses well-known datasets that often have standard splits, but these are not explicitly stated within the paper.
Hardware Specification | No | The paper thanks 'NVIDIA Corporation' for support, implying the use of NVIDIA GPUs, but it does not specify any particular GPU model (e.g., NVIDIA A100, Tesla V100), CPU, or other hardware used for the experiments.
Software Dependencies | No | The paper mentions LSTM units and the SGD and Adam optimizers but does not provide version numbers for any software libraries or frameworks (e.g., PyTorch, TensorFlow, Python) that would be needed to reproduce the experiments.
Experiment Setup | Yes | We use a 2 layer bi-directional encoder and 2 layer decoder with 512 LSTM units and 0.2 dropout with vanilla SGD optimizer. We train a one layer encoder and decoder with 128 hidden LSTM units each with a dropout rate of 0.2 using Adam and measure 0/1 accuracy.
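The Experiment Setup row quotes concrete hyperparameters for the translation models. As an illustration only, the sketch below shows one plausible way such a configuration could be instantiated in PyTorch. The framework choice, vocabulary sizes, embedding dimension, and learning rate are assumptions (the paper does not state them), and the paper's posterior attention mechanism itself is not reproduced here.

```python
# Minimal sketch (not the authors' code): a 2-layer bi-directional LSTM encoder and
# a 2-layer LSTM decoder with 512 units, 0.2 dropout, and vanilla SGD, as quoted in
# the Experiment Setup row. Vocabulary sizes, embedding dimension, and learning rate
# are assumptions; the posterior attention mechanism is omitted.
import torch
import torch.nn as nn

SRC_VOCAB, TGT_VOCAB, EMB_DIM, HIDDEN = 32000, 32000, 512, 512  # assumed sizes


class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(SRC_VOCAB, EMB_DIM)
        # 2-layer bi-directional encoder with 512 LSTM units and 0.2 dropout
        self.rnn = nn.LSTM(EMB_DIM, HIDDEN, num_layers=2,
                           bidirectional=True, dropout=0.2, batch_first=True)

    def forward(self, src_ids):
        # Returns per-token encoder states and the final (hidden, cell) states.
        return self.rnn(self.embed(src_ids))


class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(TGT_VOCAB, EMB_DIM)
        # 2-layer uni-directional decoder with 512 LSTM units and 0.2 dropout
        self.rnn = nn.LSTM(EMB_DIM, HIDDEN, num_layers=2,
                           dropout=0.2, batch_first=True)
        self.proj = nn.Linear(HIDDEN, TGT_VOCAB)

    def forward(self, tgt_ids, state=None):
        out, state = self.rnn(self.embed(tgt_ids), state)
        return self.proj(out), state  # per-token vocabulary logits


encoder, decoder = Encoder(), Decoder()
params = list(encoder.parameters()) + list(decoder.parameters())
optimizer = torch.optim.SGD(params, lr=1.0)  # "vanilla SGD"; learning rate assumed
```

The separate morphological inflection setup quoted in the same row (single-layer encoder/decoder, 128 hidden units, Adam) would follow the same pattern with those hyperparameters substituted.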