An EM Approach to Non-autoregressive Conditional Sequence Generation
Authors: Zhiqing Sun, Yiming Yang
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method on the task of machine translation. Experimental results on benchmark data sets show that the proposed approach achieves competitive, if not better, performance with existing NAR models and significantly reduces the inference latency. |
| Researcher Affiliation | Academia | 1Carnegie Mellon University, Pittsburgh, PA 15213 USA. Correspondence to: Zhiqing Sun <zhiqings@cs.cmu.edu>. |
| Pseudocode | Yes | Algorithm 1 An EM approach to NAR models |
| Open Source Code | No | The paper does not include any explicit statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We use several benchmark tasks to evaluate the effectiveness of the proposed method, including IWSLT14 German-to-English translation (IWSLT14 De-En) and WMT14 English-to-German/German-to-English translation (WMT14 En-De/De-En). ... Dataset links given in footnotes: https://wit3.fbk.eu/ and http://statmt.org/wmt14/translation-task.html |
| Dataset Splits | Yes | For the WMT14 dataset, we use Newstest2014 as test data and Newstest2013 as validation data. |
| Hardware Specification | Yes | We evaluate the average per-sentence decoding latency on WMT14 En-De test sets with batch size 1 on a single NVIDIA GeForce RTX 2080 Ti GPU by averaging 5 runs. (A timing sketch follows the table.) |
| Software Dependencies | No | The paper mentions software components like "Adam optimizer" and "label smoothing" but does not specify their version numbers or the versions of any underlying programming frameworks or libraries (e.g., PyTorch, TensorFlow). |
| Experiment Setup | Yes | We use Adam optimizer (Kingma & Ba, 2014) and employ a label smoothing (Szegedy et al., 2016) of 0.1 in all experiments. The base and large models are trained for 125k steps on 8 TPUv3 nodes in each iteration, while the small models are trained for 20k steps. We use a beam size of 20/5 for the AR model in the M/E-step of our EM training algorithm. The pseudo bounds {b̂_i} are set by early stopping with the accuracy on the validation set. (A configuration sketch follows the table.) |
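
The optimizer and regularization settings quoted in the Experiment Setup row can be illustrated with a short sketch. The snippet below is a minimal example assuming PyTorch (the paper names no framework); the model, vocabulary size, and learning rate are placeholders, while the Adam optimizer and the 0.1 label smoothing match the quoted text.

```python
# Minimal sketch (not the authors' code) of the quoted settings:
# Adam optimizer (Kingma & Ba, 2014) with label smoothing of 0.1.
import torch
import torch.nn as nn

VOCAB_SIZE = 32000                   # assumed; the quoted text does not give a vocabulary size
model = nn.Linear(512, VOCAB_SIZE)   # stand-in for the NAR translation model

# Adam optimizer; the learning rate below is a placeholder, not from the paper.
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)

# Label-smoothed cross-entropy with smoothing factor 0.1, as quoted.
# The `label_smoothing` argument requires PyTorch >= 1.10.
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)

# One illustrative training step on dummy data.
features = torch.randn(8, 512)                  # batch of 8 token representations
targets = torch.randint(0, VOCAB_SIZE, (8,))    # dummy target token ids
loss = criterion(model(features), targets)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```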
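
The latency protocol in the Hardware Specification row (batch size 1, a single GPU, the average of 5 runs) can likewise be sketched. Only those three quoted details come from the paper; `decode_sentence` and `test_sentences` below are hypothetical stand-ins, since the measurement code is not released.

```python
# Hedged sketch of a per-sentence latency measurement: batch size 1,
# one GPU, averaged over 5 runs, as described in the quoted text.
import time

import torch


def decode_sentence(sentence: str) -> None:
    """Placeholder for one NAR decoding call on a single sentence (batch size 1)."""
    if torch.cuda.is_available():
        torch.cuda.synchronize()  # ensure prior GPU work has finished before timing
    time.sleep(0.001)             # stand-in for the actual model forward pass
    if torch.cuda.is_available():
        torch.cuda.synchronize()  # wait for the decoding kernels to complete


test_sentences = ["dummy sentence"] * 100  # stand-in for the WMT14 En-De test set

run_latencies = []
for _ in range(5):  # average over 5 runs, as in the quoted protocol
    start = time.perf_counter()
    for sentence in test_sentences:  # batch size 1: decode one sentence at a time
        decode_sentence(sentence)
    run_latencies.append((time.perf_counter() - start) / len(test_sentences))

print(f"average per-sentence latency: {sum(run_latencies) / len(run_latencies):.4f} s")
```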