Non-Monotonic Sequential Text Generation

Authors: Sean Welleck, Kianté Brantley, Hal Daumé III, Kyunghyun Cho

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate that using the proposed method, it is possible to learn policies which generate text without pre-specifying a generation order, while achieving competitive performance with conventional left-to-right generation.
Researcher Affiliation | Collaboration | New York University; University of Maryland, College Park; Microsoft Research; Facebook AI Research; CIFAR Azrieli Global Scholar.
Pseudocode | No | No explicit pseudocode blocks or sections labeled 'Algorithm' were found.
Open Source Code | Yes | Code and trained models available at https://github.com/wellecks/nonmonotonic_text.
Open Datasets | Yes | We use a dataset derived from the Persona-Chat (Zhang et al., 2018) dialogue dataset, which consists of multi-turn dialogues between two agents.
Dataset Splits | Yes | We derive the examples from the same train, validation, and test splits as Persona-Chat, resulting in 133,176 train, 16,181 validation, and 15,608 test examples.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are mentioned in the paper.
Software Dependencies | No | The paper mentions architectural components like LSTM and Transformer, and tools like GloVe, but does not provide specific version numbers for any software dependencies or libraries (e.g., 'PyTorch 1.x', 'TensorFlow 2.x').
Experiment Setup | Yes | We use a uni-directional LSTM that has 2 layers of 1024 LSTM units. As the policy (decoder) we use a flat LSTM with 2 layers of 1024 LSTM units. We use a Transformer policy, following the architecture of (Vaswani et al., 2017)... For the end prediction threshold τ we use 0.5, and also report a variant (+ end tuning) in which τ is tuned based on validation BLEU (τ = 0.67).
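To make the Experiment Setup row above concrete, here is a minimal PyTorch sketch of that configuration: a 2-layer, 1024-unit unidirectional LSTM encoder, a 2-layer, 1024-unit LSTM decoder policy, and an end-prediction threshold τ = 0.5 applied when choosing each token. This is not the authors' implementation (the real code is in the repository linked in the table); the vocabulary size, embedding dimension, token index, and all identifiers (`EncoderLSTM`, `DecoderPolicy`, `predict_token`, `END_ID`) are assumptions made only for illustration.

```python
import torch
import torch.nn as nn

VOCAB_SIZE = 20000   # assumed; not specified in the section above
EMB_DIM = 300        # assumed (GloVe-sized embeddings)
HIDDEN = 1024        # 1024 LSTM units, as reported
LAYERS = 2           # 2 LSTM layers, as reported
END_ID = 0           # assumed index of the <end> token
TAU = 0.5            # end-prediction threshold from the paper

class EncoderLSTM(nn.Module):
    """Unidirectional 2-layer LSTM encoder over the source utterance."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMB_DIM)
        self.lstm = nn.LSTM(EMB_DIM, HIDDEN, num_layers=LAYERS, batch_first=True)

    def forward(self, src_ids):
        # Returns (outputs, (h, c)); the final state initializes the decoder.
        return self.lstm(self.embed(src_ids))

class DecoderPolicy(nn.Module):
    """Flat 2-layer LSTM policy that scores the next token at each step."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMB_DIM)
        self.lstm = nn.LSTM(EMB_DIM, HIDDEN, num_layers=LAYERS, batch_first=True)
        self.out = nn.Linear(HIDDEN, VOCAB_SIZE)

    def step(self, prev_ids, state):
        out, state = self.lstm(self.embed(prev_ids), state)
        logits = self.out(out[:, -1])  # scores over the vocabulary
        return logits, state

def predict_token(logits, tau=TAU):
    """Emit <end> if its probability exceeds tau; otherwise pick the
    highest-scoring ordinary word (greedy decoding, for brevity)."""
    probs = torch.softmax(logits.detach(), dim=-1)
    if probs[0, END_ID].item() > tau:
        return END_ID
    probs[0, END_ID] = 0.0  # mask <end> and take the best remaining word
    return int(probs.argmax(dim=-1))

# Tiny usage example with random inputs (batch size 1):
enc = EncoderLSTM()
dec = DecoderPolicy()
_, state = enc(torch.randint(1, VOCAB_SIZE, (1, 7)))
logits, state = dec.step(torch.randint(1, VOCAB_SIZE, (1, 1)), state)
print(predict_token(logits))
```

The threshold comparison mirrors the quoted setup (τ = 0.5 by default, with a tuned variant at τ = 0.67); the Transformer policy variant mentioned in the same row is omitted here for brevity.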