Semi-Supervised Text Simplification with Back-Translation and Asymmetric Denoising Autoencoders

Authors: Yanbin Zhao, Lu Chen, Zhi Chen, Kai Yu (pp. 9668-9675)

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Automatic and human evaluations show that our unsupervised model outperforms the previous systems, and with limited supervision, our model can perform competitively with multiple state-of-the-art simplification systems.
Researcher Affiliation | Academia | Yanbin Zhao, Lu Chen, Zhi Chen, Kai Yu; MoE Key Lab of Artificial Intelligence, SpeechLab, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China; {zhaoyb, chenlusz, zhenchi713, kai.yu}@sjtu.edu.cn
Pseudocode | Yes | Algorithm 1: Our Simplification System
Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the described methodology.
Open Datasets | Yes | We use the UNTS dataset (Surya et al. 2018) to train our unsupervised model. ... For semi-supervised training and evaluation, we also use two parallel datasets: WikiLarge (Zhang and Lapata 2017) and the Newsela dataset (Xu, Callison-Burch, and Napoles 2015).
Dataset Splits | Yes | WikiLarge comprises 359 test sentences, 2,000 development sentences, and 300k training sentences; each source sentence in the test set has 8 simplified references. Newsela... The first 1,070 articles are used for training, the next 30 articles for development, and the others for testing. (A split sketch is given below the table.)
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or other detailed computer specifications used for running experiments.
Software Dependencies | No | The paper mentions software components like Transformer, the Adam optimizer, byte-pair encoding, fastText, and LSTM language models, but does not provide specific version numbers for any of them.
Experiment Setup | Yes | Our model is built upon Transformer (Vaswani et al. 2017). Both the encoder and decoders have 3 layers with 8 multi-attention heads... The sub-word embeddings are 512-dimensional vectors... In the training process, we use the Adam optimizer (Kingma and Ba 2015); the first momentum was set to 0.5 and the batch size to 16. For reinforcement training, we dynamically adjust the balance parameter γ. At the start of the training process, γ is set to zero... As training progresses, γ is gradually increased and finally converges to 0.9. We use the sigmoid function to perform this process. ... We pre-train the asymmetric denoising autoencoders for 200,000 steps with a learning rate of 1e-4. After that, we add back-translation training with a learning rate of 5e-5. (A γ-schedule sketch is given below the table.)
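For reference, a minimal sketch of the Newsela article-level split quoted in the Dataset Splits row. Only the boundaries (first 1,070 articles for training, next 30 for development, the rest for testing) come from the paper; the function name and input format are assumptions.

```python
# Hypothetical sketch of the Newsela article-level split described above.
def split_newsela(articles):
    """articles: list of Newsela articles in the paper's original ordering."""
    train = articles[:1070]        # first 1,070 articles for training
    dev = articles[1070:1100]      # next 30 articles for development
    test = articles[1100:]         # remaining articles for testing
    return train, dev, test
```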
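Similarly, the γ schedule quoted in the Experiment Setup row (zero at the start, sigmoid-shaped growth, convergence to 0.9) can be sketched as below. The midpoint and steepness constants are illustrative assumptions, since the paper states only that a sigmoid function is used.

```python
import math

def gamma_schedule(step, gamma_max=0.9, midpoint=100_000, steepness=1e-4):
    """Hypothetical sigmoid schedule for the balance parameter gamma.

    The paper states only that gamma starts at zero, grows with a sigmoid,
    and converges to 0.9; midpoint and steepness are illustrative guesses.
    """
    return gamma_max / (1.0 + math.exp(-steepness * (step - midpoint)))

# Example behaviour under these assumed constants:
# gamma_schedule(0)       -> ~0.00004 (effectively zero at the start)
# gamma_schedule(100_000) -> 0.45
# gamma_schedule(300_000) -> ~0.9 (converges to the final value)
```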