Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Automatically Paraphrasing via Sentence Reconstruction and Round-trip Translation
Authors: Zilu Guo, Zhongqiang Huang, Kenny Q. Zhu, Guandan Chen, Kaibo Zhang, Boxing Chen, Fei Huang
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate this framework on Quora, Wiki Answers, MSCOCO and Twitter, and show its advantage over previous state-of-the-art unsupervised methods and distantly-supervised methods by significant margins on all datasets. |
| Researcher Affiliation | Collaboration | Zilu Guo¹, Zhongqiang Huang², Kenny Q. Zhu¹, Guandan Chen², Kaibo Zhang², Boxing Chen² and Fei Huang²; ¹Shanghai Jiao Tong University, ²Alibaba Damo Academy |
| Pseudocode | No | The paper describes the model architecture and equations but does not present a formal pseudocode or algorithm block. |
| Open Source Code | Yes | Code is available: https://github.com/Karlguo/paraphrase |
| Open Datasets | Yes | We evaluate our framework on four different datasets, namely Quora, Wiki Answers, MSCOCO, and Twitter. Following Liu et al. [2019], we randomly choose 20K parallel paraphrase pairs as the test set and 3K parallel paraphrase pairs as the validation set for Quora, Wiki Answers, and MSCOCO. |
| Dataset Splits | Yes | Following Liu et al. [2019], we randomly choose 20K parallel paraphrase pairs as the test set and 3K parallel paraphrase pairs as the validation set for Quora, Wiki Answers, and MSCOCO. |
| Hardware Specification | Yes | For the translation models in round-trip translation, we train them with the WMT17 zh-en dataset [Ziemski et al., 2016] with a standard transformer for 3 days on two GTX2080 GPUs. ... The training lasts 3 hours on a single GTX-2080 GPU. |
| Software Dependencies | No | The paper mentions using a 'standard transformer' and 'byte-pair encoding (BPE)' but does not list specific software dependencies with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x, Python version). |
| Experiment Setup | Yes | For the domain-specific set2seq models, we use a 2-layer transformer with 300 embedding size, 256 units, 1024 feed-forward dimensions for all layers to train them. ... For the hyper-parameter λ, when it is close to 0, the result is similar to the round-trip translation. When λ is between 0.4-0.8, the result is stable, and iBLEU is above 14. As λ goes to infinity, the result is slowly approaching that of set2seq. We set the value to 0.5 for all datasets after experimenting with different choices. |
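
The Dataset Splits row describes randomly choosing 20K parallel paraphrase pairs as the test set and 3K as the validation set. A minimal sketch of such a random hold-out is given below, for illustration only. It is not the authors' code: the function name `split_pairs`, the fixed seed, and the `rest` return value are assumptions, and the quoted text does not say how the remaining pairs are used.

```python
import random


def split_pairs(pairs, n_test=20_000, n_val=3_000, seed=0):
    """Randomly hold out test and validation pairs from a list of paraphrase pairs.

    Hypothetical helper: the seed and the handling of leftover pairs are
    assumptions, not details taken from the paper.
    """
    if n_test + n_val > len(pairs):
        raise ValueError("not enough pairs for the requested split")

    rng = random.Random(seed)   # local RNG so the split is repeatable
    shuffled = list(pairs)      # copy, so the caller's list is not reordered
    rng.shuffle(shuffled)

    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    rest = shuffled[n_test + n_val:]  # whatever is left after both hold-outs
    return rest, val, test


# Usage with synthetic placeholder pairs (not real dataset content).
pairs = [(f"sentence {i}", f"paraphrase {i}") for i in range(30_000)]
rest, val, test = split_pairs(pairs)
```

Using a local `random.Random(seed)` rather than the module-level generator keeps the split independent of any other random calls in the same process.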