Sequence Generation with Optimal-Transport-Enhanced Reinforcement Learning

Authors: Liqun Chen, Ke Bai, Chenyang Tao, Yizhe Zhang, Guoyin Wang, Wenlin Wang, Ricardo Henao, Lawrence Carin

AAAI 2020, pp. 7512-7520

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To validate the effectiveness of the proposed solution, we perform a comprehensive evaluation covering a wide variety of NLP tasks: machine translation, abstractive text summarization, and image captioning, with consistent improvements over competing solutions. Experiments (Datasets and setup): We consider three tasks to evaluate our model: i) Machine translation:...
Researcher Affiliation | Academia | Liqun Chen, Ke Bai, Chenyang Tao, Yizhe Zhang, Guoyin Wang, Wenlin Wang, Ricardo Henao, Lawrence Carin (Duke University)
Pseudocode | Yes | Algorithm 1: IPOT algorithm; Algorithm 2: OTRL for Seq2Seq learning. (An illustrative IPOT sketch follows after this table.)
Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the proposed OTRL method. It mentions using existing codebases such as Texar and a public PyTorch implementation for baselines.
Open Datasets | Yes | i) Machine translation: the commonly used English-German and English-Vietnamese translation datasets, IWSLT 2014 (Cettolo et al., 2014)... ii) Abstractive summarization: two different datasets are employed for the summarization task, English Gigaword (Graff et al., 2003) and CNN/DailyMail (Hermann et al., 2015; Nallapati et al., 2016)... iii) Image captioning: we also consider an image captioning task with the COCO dataset (Lin et al., 2014)...
Dataset Splits | Yes | The EN-DE dataset has 146K/7K/7K paired sentences for training/validation/testing, respectively.
Hardware Specification | Yes | All experiments are performed on one NVIDIA TITAN X GPU.
Software Dependencies | No | The paper mentions using Texar, PyTorch, and TensorFlow, but does not provide specific version numbers for these or any other key libraries used in the experiments.
Experiment Setup | Yes | We use Adam optimization with learning rate 0.001 and batch size 64 for training. For the hyper-parameters λ1, λ2, λ3, we use grid search to find the best setup for each task. In machine translation and abstractive summarization, we set λ1 = 0.9, λ2 = 0.1, λ3 = 0 as the initialization, then gradually decrease λ1 to 0.7 and increase λ3 to 0.2. This annealing process starts after the 7th epoch. (A sketch of one possible reading of this schedule follows below.)
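
For reference, below is a minimal NumPy sketch of an inexact proximal point (IPOT-style) optimal-transport iteration of the kind named in Algorithm 1 of the paper. The cost matrix construction, the proximal weight beta, and the iteration counts are illustrative assumptions rather than values from the paper, and the ipot helper is hypothetical, not the authors' released code.

```python
import numpy as np

def ipot(C, beta=0.5, n_outer=50, n_inner=1):
    """IPOT-style inexact proximal point iteration for optimal transport.

    C       : (n, m) cost matrix, e.g. 1 - cosine similarity of word embeddings
              (assumed cost; the paper's exact cost construction is not shown here).
    beta    : proximal step size (illustrative default).
    returns : (n, m) approximate transport plan T.
    """
    n, m = C.shape
    mu = np.ones(n) / n            # uniform source marginal (assumption)
    nu = np.ones(m) / m            # uniform target marginal (assumption)
    sigma = np.ones(m) / m
    T = np.ones((n, m))
    G = np.exp(-C / beta)          # Gibbs kernel of the cost

    for _ in range(n_outer):
        Q = G * T                  # proximal kernel for this outer step
        for _ in range(n_inner):   # Sinkhorn-style inner scaling updates
            delta = mu / (Q @ sigma)
            sigma = nu / (Q.T @ delta)
        T = delta[:, None] * Q * sigma[None, :]   # diag(delta) Q diag(sigma)
    return T
```

A transport plan computed this way is typically combined with the cost matrix as a sequence-level OT score, np.sum(T * C); how that score enters the Seq2Seq objective follows the paper's Algorithm 2 and is not reproduced here.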
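
The loss-weight annealing quoted in the Experiment Setup row is only partially specified (initial values, final values, and the start epoch). The sketch below is one possible reading: the 5-epoch linear ramp, the loss_weights helper, and the mapping of λ1/λ2/λ3 to particular loss terms are assumptions for illustration; only the Adam settings, the initial and final weights, and the epoch-7 start come from the quoted text.

```python
import torch

def loss_weights(epoch, ramp_epochs=5):
    """Return (lambda1, lambda2, lambda3) for the combined training loss.

    Which weight attaches to which loss term (e.g. MLE / OT / RL) is an
    assumption here; the ramp length is likewise assumed.
    """
    if epoch <= 7:
        return 0.9, 0.1, 0.0                   # initialization reported in the paper
    t = min((epoch - 7) / ramp_epochs, 1.0)    # assumed linear annealing after epoch 7
    lam1 = 0.9 - 0.2 * t                       # lambda1: 0.9 -> 0.7
    lam3 = 0.0 + 0.2 * t                       # lambda3: 0.0 -> 0.2
    return lam1, 0.1, lam3

model = torch.nn.Linear(8, 8)                  # placeholder for a Seq2Seq model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # reported lr; batch size 64
```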