Novelty Controlled Paraphrase Generation with Retrieval Augmented Conditional Prompt Tuning

Authors: Jishnu Ray Chowdhury, Yong Zhuang, Shuyi Wang

AAAI 2022, pp. 10535-10544

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | By conducting extensive experiments on four datasets, we demonstrate the effectiveness of the proposed approaches for retaining the semantic content of the original text while inducing lexical novelty in the generation.
Researcher Affiliation | Collaboration | 1 University of Illinois at Chicago, 2 Bloomberg; jraych2@uic.edu, yzhuang52@bloomberg.net, swang1072@bloomberg.net
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper mentions using 'the official code for LoRA', which refers to a third-party library, not the authors' own implementation code. There is no explicit statement or link provided for the source code of their proposed methods.
Open Datasets | Yes | Quora Question Pairs 50K split (QQP 50K): Quora Question Pairs (QQP) is a paraphrase detection dataset. We only use the true paraphrase pairs. We use the 50K dataset split as used in Gupta et al. (2018). Microsoft Research Paraphrase Corpus (MSRPC): MSRPC (Dolan, Quirk, and Brockett 2004) is another paraphrase detection corpus. ParaSCI-ACL: ParaSCI-ACL (Dong, Wan, and Cao 2021) is a paraphrase generation dataset in the scientific domain. We use the official split. (See the dataset-loading sketch after this table.)
Dataset Splits | Yes | Details of dataset split sizes are presented in Table 3, e.g. QQP 50K: 46,000 training / 4,000 validation / 4,000 test.
Hardware Specification | Yes | The models are trained and tuned on single Tesla V100 32GB GPUs.
Software Dependencies | No | The paper mentions specific software components like 'AdamW', 'Transformers library (Wolf et al. 2020)', 'sentence-transformers', and 'LoRA' but does not provide specific version numbers for these. (A version-logging sketch follows the table.)
Experiment Setup | Yes | We tune the hyperparameters on QQP 50K with GPT2 medium for all the approaches. We search the learning rate within {0.1, 0.01, 1e-3, 1e-4, 5e-5}. For adapter tuning, we search the adapter bottleneck hidden state dimension within {128, 256, 512}. For LoRA, LPT, RAPT, and NC-RAPT (all approaches involving LoRA), we fix r (matrix rank) at 8. We also use a weight decay of 0.01 for LoRA-based methods. We set the infix length for all prompt tuning methods to 8. We search the prefix length of prompt tuning random, prefix tuning, and prefix-layer tuning within {8, 64, 256}. In all cases, we use AdamW (Loshchilov and Hutter 2019) as the optimizer. We also use a linear schedule with warmup for 100 steps, a gradient norm clipping with a maximum of 1, a batch size of 32, and a maximum decoding length of n+100. We set the early stopping patience as 3. Model selection during training is done based on validation loss. (A training-setup sketch follows the table.)
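
The Open Datasets and Dataset Splits rows describe QQP preprocessing: only true paraphrase pairs are kept, and the 50K split has 46,000/4,000/4,000 examples. Below is a minimal sketch of that filtering step, assuming the GLUE release of QQP via the Hugging Face datasets library; this release is not necessarily identical to the Gupta et al. (2018) 50K split used in the paper.

```python
# Minimal sketch: keep only the true paraphrase pairs from QQP, as described in
# the Open Datasets row. Uses the GLUE release of QQP (an assumption); the paper
# instead uses the 50K split of Gupta et al. (2018) with 46K/4K/4K examples.
from datasets import load_dataset

qqp_train = load_dataset("glue", "qqp", split="train")
true_pairs = qqp_train.filter(lambda ex: ex["label"] == 1)  # label 1 marks duplicate (paraphrase) pairs
print(f"{len(true_pairs)} true paraphrase pairs")
print(true_pairs[0]["question1"], "->", true_pairs[0]["question2"])
```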
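
Because the Software Dependencies row notes that no version numbers are reported, the following sketch shows one way to log the versions of the named components in a given environment. The pip package names (in particular 'loralib' for the official LoRA code) are assumptions, not taken from the paper.

```python
# Log installed versions of the libraries named in the paper, which does not pin them.
# Package names here are assumptions for illustration.
from importlib.metadata import PackageNotFoundError, version

for pkg in ("torch", "transformers", "sentence-transformers", "loralib"):
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```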
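
Finally, a minimal PyTorch/Transformers sketch wiring together the optimization settings reported in the Experiment Setup row: GPT-2 medium, AdamW, a linear schedule with 100 warmup steps, gradient-norm clipping at 1, batch size 32, and weight decay 0.01 (which the paper applies to the LoRA-based methods). The learning rate shown is one value from the reported search grid; the total step count and toy batch are placeholders, and the prompt-tuning and LoRA modules themselves are omitted.

```python
# Sketch of the reported optimization setup: AdamW, linear warmup schedule (100 steps),
# and gradient-norm clipping at 1.0, on the GPT-2 medium backbone used for tuning.
# Anything marked "placeholder" is an assumption, not a value from the paper.
import torch
from torch.optim import AdamW
from transformers import GPT2LMHeadModel, GPT2TokenizerFast, get_linear_schedule_with_warmup

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2-medium")
tokenizer.pad_token = tokenizer.eos_token                 # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained("gpt2-medium")

total_steps = 1000                                        # placeholder; depends on epochs and data size
optimizer = AdamW(model.parameters(), lr=5e-5, weight_decay=0.01)  # lr taken from the search grid
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=100, num_training_steps=total_steps)

# One illustrative optimization step on a toy batch (the paper uses batch size 32).
batch = tokenizer(["how can i learn python quickly?"] * 4, return_tensors="pt", padding=True)
outputs = model(**batch, labels=batch["input_ids"])       # for a real run, mask pad positions with -100
outputs.loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
scheduler.step()
optimizer.zero_grad()
```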