Contrastive Learning with Adversarial Perturbations for Conditional Text Generation

Authors: Seanie Lee, Dong Bok Lee, Sung Ju Hwang

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We empirically show that our proposed method significantly improves the generalization of the seq2seq on three text generation tasks: machine translation, text summarization, and question generation."
Researcher Affiliation | Collaboration | Seanie Lee (1), Dong Bok Lee (1), Sung Ju Hwang (1, 2); (1) KAIST, (2) AITRICS, South Korea; {lsnfamily02, markhi, sjhwang82}@kaist.ac.kr
Pseudocode | No | The paper describes its method using mathematical equations and text but does not include structured pseudocode or an algorithm block.
Open Source Code | No | The paper states that it uses "the transformers library (Wolf et al., 2019)", a third-party library, but it does not release its own implementation of the method described in the paper.
Open Datasets | Yes | "For machine translation, we use WMT16 Romanian-English parallel corpus (WMT16 RO-EN) to train the model. ... For text summarization, we use XSum dataset (Narayan et al., 2018) ... We finetune T5-small model on SQuAD dataset (Rajpurkar et al., 2016)."
Dataset Splits | Yes | "Since the test set of SQuAD is only accessible via the leaderboard, we randomly split the validation set into a validation set and a test set." (Table 4 confirms the validation set sizes for all datasets.)
Hardware Specification | No | The paper states "We use 8 GPUs for text summarization, and 4 GPUs for machine translation and question generation." This specifies the GPU count but not the GPU model (e.g., NVIDIA V100 or A100), so the hardware setup is not fully reproducible.
Software Dependencies | No | The paper mentions the "pretrained T5-small model provided from the transformers library (Wolf et al., 2019)" and the Adafactor optimizer, but it does not specify version numbers for these dependencies, which are needed to reproduce the software environment.
Experiment Setup | Yes | "We finetune the pretrained T5-small model for 20 epochs with the batch size of 128 and Adafactor. For contrastive learning, we set the norm of perturbation, η and ϵ as 3.0. ... we set the temperature, τ as 0.1 for all the experiments. At test time, we use beam search of width 4." (A configuration sketch based on these values is given below the table.)
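
To make the reported setup easier to act on, here is a minimal sketch of how the quoted hyperparameters could be declared with the Hugging Face transformers library that the paper says it builds on. It is an illustration only: the function name `decode`, the `max_length` value, and the Adafactor arguments are assumptions not taken from the paper, and the paper's contrastive training objective with adversarial perturbations is not reproduced here.

```python
# Hypothetical configuration sketch: it only maps the hyperparameters quoted in the
# "Experiment Setup" row onto the standard transformers API. The paper's contrastive
# loss and adversarial/imposter perturbation generation are NOT implemented here,
# and the training loop is omitted.
import torch
from transformers import Adafactor, T5ForConditionalGeneration, T5TokenizerFast

model = T5ForConditionalGeneration.from_pretrained("t5-small")
tokenizer = T5TokenizerFast.from_pretrained("t5-small")

# Values reported in the paper.
NUM_EPOCHS = 20       # finetuning epochs
BATCH_SIZE = 128      # batch size
PERTURB_NORM = 3.0    # norm bound for the perturbations eta and epsilon
TEMPERATURE = 0.1     # tau, temperature of the contrastive loss
BEAM_WIDTH = 4        # beam search width at test time

# Adafactor from the transformers library; the exact arguments (learning-rate
# schedule, warmup) are not stated in the paper, so the settings below are assumptions.
optimizer = Adafactor(
    model.parameters(),
    scale_parameter=True,
    relative_step=True,
    warmup_init=True,
    lr=None,
)

def decode(source_text: str, max_length: int = 64) -> str:
    """Beam-search decoding with the width reported for test time (max_length is assumed)."""
    inputs = tokenizer(source_text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        output_ids = model.generate(**inputs, num_beams=BEAM_WIDTH, max_length=max_length)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Because the paper does not pin library versions (see the Software Dependencies row), anyone re-running this setup should record the transformers version they actually install.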