Insertion Transformer: Flexible Sequence Generation via Insertion Operations
Authors: Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our approach by analyzing its performance on the WMT 2014 English-German machine translation task under various settings for training and decoding. We find that the Insertion Transformer outperforms many prior non-autoregressive approaches to translation at comparable or better levels of parallelism, and successfully recovers the performance of the original Transformer while requiring only logarithmically many iterations during decoding. In this section, we explore the efficacy of our approach on a machine translation task, analyzing its performance under different training conditions, architectural choices, and decoding procedures. We experiment on the WMT 2014 English-German translation dataset, using newstest2013 for development and newstest2014 for testing, respectively. |
| Researcher Affiliation | Collaboration | ¹Google Brain, Mountain View, Toronto, Berlin; ²University of California, Berkeley. |
| Pseudocode | No | The paper describes the model architecture and decoding procedures verbally but does not contain structured pseudocode or algorithm blocks. (An illustrative sketch of the parallel insertion decoding loop it describes is given below the table.) |
| Open Source Code | No | The paper mentions using TensorFlow and Tensor2Tensor framework but does not provide any concrete access to source code for the Insertion Transformer itself, nor does it state that the code is being released. |
| Open Datasets | Yes | We experiment on the WMT 2014 English-German translation dataset, using newstest2013 for development and newstest2014 for testing, respectively. |
| Dataset Splits | Yes | We experiment on the WMT 2014 English-German translation dataset, using newstest2013 for development and newstest2014 for testing, respectively. (A dataset-loading sketch following this split convention is given below the table.) |
| Hardware Specification | Yes | All our models are trained for 1,000,000 steps on eight P100 GPUs. |
| Software Dependencies | No | All our experiments are implemented in TensorFlow (Abadi et al., 2015) using the Tensor2Tensor framework (Vaswani et al., 2018). (The paper names these software packages but does not give version numbers for them.) |
| Experiment Setup | Yes | We use the default transformer base hyperparameter set reported by Vaswani et al. (2018) for all hyperparameters not specific to our model. We perform no additional hyperparameter tuning. All our models are trained for 1,000,000 steps on eight P100 GPUs. (A configuration sketch for the `transformer_base` hyperparameter set is given below the table.) |
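
The paper describes its decoding procedure only in prose, so the following is an illustrative sketch written for this report rather than code taken from the authors. It mirrors the parallel greedy decoding loop the paper outlines: at each iteration the model scores one (token, slot) pair for every slot of the current partial output, every slot that does not choose the end-of-slot symbol receives its best token simultaneously, and decoding stops once all slots terminate. The `score_slots` callable is a hypothetical stand-in for the trained Insertion Transformer.

```python
# Illustrative sketch (not from the paper) of parallel greedy insertion decoding.
from typing import Callable, List, Tuple

END_OF_SLOT = "<eos-slot>"  # the "insert nothing into this slot" symbol


def parallel_greedy_decode(
    score_slots: Callable[[List[str]], List[Tuple[str, float]]],
    max_iterations: int = 64,
) -> List[str]:
    """Grow the hypothesis by inserting into every unfinished slot at once."""
    hypothesis: List[str] = []
    for _ in range(max_iterations):
        # One (best_token, score) pair per slot; an n-token hypothesis has n + 1 slots.
        best_per_slot = score_slots(hypothesis)
        insertions = [
            (slot, token)
            for slot, (token, _) in enumerate(best_per_slot)
            if token != END_OF_SLOT
        ]
        if not insertions:  # every slot chose end-of-slot: decoding is finished
            break
        # Apply insertions from right to left so earlier slot indices stay valid.
        for slot, token in sorted(insertions, reverse=True):
            hypothesis.insert(slot, token)
    return hypothesis
```

With slot-level termination and the balanced binary tree training loss, this loop finishes in roughly log2(n) iterations for an n-token output, which is the behaviour the paper reports.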
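The paper builds its data pipeline through Tensor2Tensor; purely as an illustration of the split convention it states (newstest2013 for development, newstest2014 for testing), here is a minimal sketch using the Hugging Face `datasets` library, which is an assumption of this report and not the authors' tooling.

```python
# Minimal sketch (assumption: Hugging Face `datasets`, not the authors'
# Tensor2Tensor pipeline) of loading WMT 2014 English-German with the split
# convention named in the paper.
from datasets import load_dataset

wmt14 = load_dataset("wmt14", "de-en")

train = wmt14["train"]      # WMT 2014 training corpus
dev = wmt14["validation"]   # newstest2013 (development)
test = wmt14["test"]        # newstest2014 (testing)

# Each example carries a "translation" dict keyed by language code.
pair = test[0]["translation"]
print(pair["en"], "->", pair["de"])
```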
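For the training configuration, the quoted "default transformer base hyperparameter set" presumably corresponds to the `transformer_base` hyperparameter set registered in Tensor2Tensor, which the paper builds on. The hedged sketch below only inspects that shared baseline set; the Insertion Transformer model itself and its 1,000,000-step, eight-GPU training run are not publicly available, so nothing beyond the baseline defaults is reconstructed here.

```python
# Hedged sketch: inspect Tensor2Tensor's `transformer_base` hyperparameter set
# (a TensorFlow 1.x library). This covers only the shared baseline
# configuration; the paper's Insertion Transformer code is not released.
from tensor2tensor.models import transformer

hparams = transformer.transformer_base()

# Print a few of the defaults this set defines; the values come from the
# library itself rather than being restated in this report.
for name in ("hidden_size", "num_hidden_layers", "num_heads", "filter_size"):
    print(name, "=", getattr(hparams, name))
```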