Automatically Paraphrasing via Sentence Reconstruction and Round-trip Translation
Authors: Zilu Guo, Zhongqiang Huang, Kenny Q. Zhu, Guandan Chen, Kaibo Zhang, Boxing Chen, Fei Huang
IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate this framework on Quora, Wiki Answers, MSCOCO and Twitter, and show its advantage over previous state-of-the-art unsupervised methods and distantly-supervised methods by significant margins on all datasets. |
| Researcher Affiliation | Collaboration | Zilu Guo1, Zhongqiang Huang2, Kenny Q. Zhu1, Guandan Chen2, Kaibo Zhang2, Boxing Chen2 and Fei Huang2; 1Shanghai Jiao Tong University, 2Alibaba Damo Academy |
| Pseudocode | No | The paper describes the model architecture and equations but does not present a formal pseudocode or algorithm block. |
| Open Source Code | Yes | Code is available: https://github.com/Karlguo/paraphrase |
| Open Datasets | Yes | We evaluate our framework on four different datasets, namely Quora, Wiki Answers, MSCOCO, and Twitter. Following Liu et al. [2019], we randomly choose 20K parallel paraphrase pairs as the test set and 3K parallel paraphrase pairs as the validation set for Quora, Wiki Answers, and MSCOCO. |
| Dataset Splits | Yes | Following Liu et al. [2019], we randomly choose 20K parallel paraphrase pairs as the test set and 3K parallel paraphrase pairs as the validation set for Quora, Wiki Answers, and MSCOCO. (A hedged split sketch follows the table.) |
| Hardware Specification | Yes | For the translation models in round-trip translation, we train them with the WMT17 zh-en dataset [Ziemski et al., 2016] with a standard transformer for 3 days on two GTX-2080 GPUs. ... The training lasts 3 hours on a single GTX-2080 GPU. |
| Software Dependencies | No | The paper mentions using a 'standard transformer' and 'byte-pair encoding (BPE)' but does not list specific software dependencies with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x, Python version). |
| Experiment Setup | Yes | For the domain-specific set2seq models, we use a 2-layer transformer with 300 embedding size, 256 units, 1024 feed-forward dimensions for all layers to train them. ... For the hyper-parameter λ, when it is close to 0, the result is similar to the round-trip translation. When λ is between 0.4-0.8, the result is stable, and iBLEU is above 14. As λ goes to infinity, the result is slowly approaching that of set2seq. We set the value to 0.5 for all datasets after experimenting with different choices. (A hedged configuration sketch follows the table.) |
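
The Dataset Splits row reports a random partition of 20K parallel paraphrase pairs for test and 3K for validation on Quora, Wiki Answers, and MSCOCO, with the remainder used for training. The sketch below is a minimal illustration of such a split, not code from the released repository; the function name and the assumption that `pairs` is a list of parallel paraphrase pairs are ours.

```python
# Hedged sketch (not the authors' released code): a random split matching the
# sizes quoted above -- 20K pairs for test, 3K for validation, rest for training.
import random

def split_paraphrase_pairs(pairs, test_size=20_000, valid_size=3_000, seed=0):
    """Randomly partition parallel paraphrase pairs into train/valid/test."""
    pairs = list(pairs)                      # copy so the caller's list is untouched
    random.Random(seed).shuffle(pairs)       # fixed seed for a reproducible split
    test = pairs[:test_size]
    valid = pairs[test_size:test_size + valid_size]
    train = pairs[test_size + valid_size:]
    return train, valid, test
```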
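The Experiment Setup row quotes a 2-layer set2seq transformer (300-dim embeddings, 256 hidden units, 1024 feed-forward dimensions) and a hyper-parameter λ whose extremes recover round-trip translation (λ near 0) and set2seq (λ large). The sketch below is one consistent reading of that behavior as a λ-weighted score fusion at decoding time; the names, the dataclass, and the exact fusion formula are our assumptions, not the paper's or the repository's implementation.

```python
# Hedged sketch, not the authors' implementation: mixing the round-trip
# translation model and the domain-specific set2seq model with weight lambda.
# With score = translation + lambda * set2seq, lambda -> 0 behaves like pure
# round-trip translation and large lambda approaches pure set2seq, matching
# the trend described in the Experiment Setup row. All names are hypothetical.
from dataclasses import dataclass


@dataclass
class Set2SeqConfig:
    # Hyper-parameter values quoted in the Experiment Setup row above.
    num_layers: int = 2
    embed_dim: int = 300
    hidden_dim: int = 256
    ffn_dim: int = 1024


def fused_score(translation_logprob: float,
                set2seq_logprob: float,
                lam: float = 0.5) -> float:
    """Score a candidate hypothesis by combining the two models' log-probabilities.

    lam = 0.5 is the value the authors report using for all datasets.
    """
    return translation_logprob + lam * set2seq_logprob
```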