Target-Side Input Augmentation for Sequence to Sequence Generation
Authors: Shufang Xie, Ang Lv, Yingce Xia, Lijun Wu, Tao Qin, Tie-Yan Liu, Rui Yan
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct comprehensive experiments on various sequence generation tasks, including dialog generation, machine translation, and abstractive summarization. |
| Researcher Affiliation | Collaboration | Shufang Xie1, Ang Lv1, Yingce Xia2, Lijun Wu2, Tao Qin2, Tie-Yan Liu2, Rui Yan1 1Gaoling School of Artificial Intelligence, Renmin University of China 2Microsoft Research Asia 1shufangxie@ruc.edu.cn, lvangupup@gmail.com, ruiyan@ruc.edu.cn 2{yingce.xia, lijuwu, taoqin, tyliu}@microsoft.com |
| Pseudocode | No | The paper describes the proposed algorithm in prose and mathematical equations but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/TARGET-SIDE-DATA-AUG/TSDASG. |
| Open Datasets | Yes | We conduct experiments on two commonly used dialog generation data sets: the Daily Dialog (Li et al., 2017) for single-turn dialog generation and Persona-Chat (Zhang et al., 2018) for multi-turn dialog generation. ... the IWSLT 14 English (EN)-German (DE) data set (Cettolo et al., 2014) ... the WMT 14 EN-DE dataset (Bojar et al., 2014) ... the CNN/DM (Hermann et al., 2015) news summary dataset. |
| Dataset Splits | Yes | We follow the script of Luo et al. (2018) to pre-process the Daily Dialog data, where the dialog is represented as request-response pairs. Then the first 80% pairs are used for training, the next 10% for validation, and the last 10% for test. (A minimal split sketch appears after this table.) |
| Hardware Specification | No | The paper mentions using a "Transformer network architecture" but does not specify any hardware components like GPU models, CPU types, or cloud computing instances used for experiments. |
| Software Dependencies | No | Our implementation is based on the fairseq framework (Ott et al., 2019). We use Transformer (Vaswani et al., 2017) network architecture in all experiments... We use Adam optimizer (Kingma & Ba, 2015)... We compute the BLEU score by the Moses script (Koehn et al., 2007)... We use the files2rouge tool to evaluate... No specific version numbers are provided for these software dependencies, only the names and relevant citations. |
| Experiment Setup | Yes | We use Transformer (Vaswani et al., 2017) network architecture in all experiments with different model sizes, which are adjusted according to the data size. During training, we use Adam optimizer (Kingma & Ba, 2015) with Adam β = (0.9, 0.98) and inverse sqrt learning rate scheduler. Meanwhile, we used label smoothing of value 0.1... We use transformer small configuration for Daily Dialog dataset and transformer base for Persona-Chat dataset, where both the encoder and the decoder consist of six layers. The (Embed Dim, FFN Embed Dim) of those configurations are (512, 1024) and (512, 2048), respectively. Our results are generated by beam search with beam size 5. We compute the BLEU score by the Moses script (Koehn et al., 2007) with the same tokenizer used by previous works. (A hedged sketch of these training settings appears after this table.) |
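
The ordered 80/10/10 split quoted in the Dataset Splits row can be written out in a few lines of Python. This is a minimal sketch that assumes the pre-processed data is already available as a list of (request, response) pairs; it is not the authors' actual pre-processing script.

```python
# Ordered 80/10/10 split: first 80% train, next 10% validation, last 10% test.
# `pairs` is a hypothetical list of (request, response) tuples.
def split_pairs(pairs):
    n = len(pairs)
    train = pairs[: int(0.8 * n)]               # first 80% for training
    valid = pairs[int(0.8 * n): int(0.9 * n)]   # next 10% for validation
    test = pairs[int(0.9 * n):]                 # last 10% for test
    return train, valid, test
```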
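
The hyperparameters quoted in the Experiment Setup row map onto standard optimizer and loss settings. The sketch below restates them in plain PyTorch rather than fairseq: the model shape follows the "transformer small" numbers quoted above (six encoder and decoder layers, Embed Dim 512, FFN Embed Dim 1024), while the number of attention heads, the base learning rate, and the warmup length are illustrative assumptions that are not reported in the table.

```python
import torch
import torch.nn as nn

# "transformer small"-like shape from the table: 6+6 layers, 512/1024 dims.
# nhead=8 is an assumption; the number of heads is not quoted above.
model = nn.Transformer(d_model=512, nhead=8, dim_feedforward=1024,
                       num_encoder_layers=6, num_decoder_layers=6)

# Adam with beta = (0.9, 0.98); the base lr of 5e-4 is an assumed placeholder.
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4, betas=(0.9, 0.98))

# Inverse sqrt learning rate schedule: linear warmup, then decay ~ step^-0.5.
WARMUP = 4000  # assumed; the warmup length is not quoted in the table

def inverse_sqrt(step: int) -> float:
    step = max(step, 1)
    return min(step / WARMUP, (WARMUP / step) ** 0.5)

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=inverse_sqrt)

# Label smoothing of 0.1, as stated in the Experiment Setup row.
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)
```

Decoding with beam size 5 and BLEU scoring via the Moses script are evaluation-time steps quoted in the same row and are not part of this training sketch.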