Contrastive Learning with Adversarial Perturbations for Conditional Text Generation
Authors: Seanie Lee, Dong Bok Lee, Sung Ju Hwang
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically show that our proposed method significantly improves the generalization of the seq2seq model on three text generation tasks: machine translation, text summarization, and question generation. |
| Researcher Affiliation | Collaboration | Seanie Lee (KAIST), Dong Bok Lee (KAIST), Sung Ju Hwang (KAIST, AITRICS), South Korea; {lsnfamily02, markhi, sjhwang82}@kaist.ac.kr |
| Pseudocode | No | The paper describes its method using mathematical equations and text but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states it uses 'the transformers library (Wolf et al., 2019)' which is a third-party tool, but it does not provide its own specific code for the methodology described in the paper. |
| Open Datasets | Yes | For machine translation, we use WMT16 Romanian-English parallel corpus (WMT16 RO-EN) to train the model. ... For text summarization, we use XSum dataset (Narayan et al., 2018) ... We finetune T5-small model on SQuAD dataset (Rajpurkar et al., 2016). (A hedged loading and splitting sketch follows the table.) |
| Dataset Splits | Yes | Since the test set of SQuAD is only accessible via leaderboard, we randomly split the validation set into a validation set and a test set. (Table 4 confirms validation set sizes for all datasets.) |
| Hardware Specification | No | The paper states 'We use 8 GPUs for text summarization, and 4 GPUs for machine translation and question generation.' This specifies the quantity but not the GPU model (e.g., NVIDIA V100 or A100), so the hardware setup is not fully reproducible. |
| Software Dependencies | No | The paper mentions 'pretrained T5-small model provided from the transformers library (Wolf et al., 2019)' and the 'Adafactor optimizer.' However, it does not specify version numbers for the transformers library or the underlying deep learning framework, which would be needed for reproducible software dependencies. (A short environment-capture snippet follows the table.) |
| Experiment Setup | Yes | We finetune the pretrained T5-small model for 20 epochs with the batch size of 128 and Adafactor. For contrastive learning, we set the norm of perturbation, η and ϵ, as 3.0. ... we set the temperature, τ, as 0.1 for all the experiments. At test time, we use beam search of width 4. (A hedged configuration sketch follows the table.) |
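The three datasets named above (WMT16 RO-EN, XSum, SQuAD) are all publicly available. As a rough guide to obtaining them and reproducing the kind of validation/test split described for SQuAD, the sketch below uses the Hugging Face `datasets` library. The paper does not state which loader it used, nor does it report the split ratio or random seed, so those choices are assumptions for illustration.

```python
# Hedged sketch: loading the three public datasets and splitting SQuAD's
# validation set, assuming the Hugging Face `datasets` library (the paper
# does not say which loader the authors used).
from datasets import load_dataset

wmt16_ro_en = load_dataset("wmt16", "ro-en")   # machine translation
xsum = load_dataset("xsum")                    # text summarization
squad = load_dataset("squad")                  # question generation

# The paper splits SQuAD's official validation set into new validation and
# test sets because the official test set is leaderboard-only. The 50/50
# ratio and the seed below are assumptions, not values reported in the paper.
squad_heldout = squad["validation"].train_test_split(test_size=0.5, seed=42)
squad_validation = squad_heldout["train"]  # used for model selection
squad_test = squad_heldout["test"]         # used for final evaluation
```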
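Because the paper does not pin library versions, anyone attempting a reproduction will need to record their own environment. A minimal way to do that for the libraries the paper names is shown below; the printed versions are whatever is installed locally, not values from the paper.

```python
# Record the locally installed versions of the libraries the paper relies on;
# the paper itself does not report these numbers.
import torch
import transformers

print("transformers:", transformers.__version__)
print("torch:", torch.__version__)
# A full snapshot can then be captured with, e.g., `pip freeze > requirements.txt`.
```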
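As a rough illustration of the reported setup, the sketch below wires together the quoted hyperparameters (T5-small, 20 epochs, batch size 128, Adafactor, temperature 0.1, perturbation norm 3.0, beam width 4) using the Hugging Face `transformers` API. Everything not quoted in the row above, including the Adafactor settings, the task prefix, and the omitted training loop and contrastive loss, is an assumption rather than the authors' implementation.

```python
# Hedged sketch of the reported fine-tuning configuration; it does not
# implement the paper's contrastive objective or adversarial perturbations.
from transformers import T5ForConditionalGeneration, T5TokenizerFast
from transformers.optimization import Adafactor

model = T5ForConditionalGeneration.from_pretrained("t5-small")
tokenizer = T5TokenizerFast.from_pretrained("t5-small")

# Hyperparameters quoted from the paper.
NUM_EPOCHS = 20        # fine-tuning epochs
BATCH_SIZE = 128       # batch size (possibly accumulated across 4 or 8 GPUs)
TAU = 0.1              # contrastive temperature (used by the paper's loss)
PERTURB_NORM = 3.0     # norm bound for the perturbations (η and ϵ)

# Adafactor is named in the paper, but its settings are not; library defaults
# with relative step sizing are used here as an assumption.
optimizer = Adafactor(
    model.parameters(),
    lr=None,
    scale_parameter=True,
    relative_step=True,
    warmup_init=True,
)

# ... the training loop (cross-entropy plus the paper's contrastive term built
# from adversarial perturbations) would go here ...

# Test-time decoding with the quoted beam width; the task prefix is illustrative.
inputs = tokenizer("translate Romanian to English: Bună dimineața!", return_tensors="pt")
outputs = model.generate(**inputs, num_beams=4, max_length=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```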