NAREOR: The Narrative Reordering Problem

Authors: Varun Gangal, Steven Y. Feng, Malihe Alikhani, Teruko Mitamura, Eduard Hovy

AAAI 2022, pp. 10645-10653

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present a dataset, NAREORC, with human rewritings of stories within ROCStories in non-linear orders, and conduct a detailed analysis of it. Further, we propose novel task-specific training methods with suitable evaluation metrics. We perform experiments on NAREORC using state-of-the-art models such as BART and T5 and conduct extensive automatic and human evaluations.
Researcher Affiliation | Academia | 1: Language Technologies Institute, Carnegie Mellon University; 2: School of Computing and Information, University of Pittsburgh. {vgangal,syfeng,teruko,hovy}@cs.cmu.edu, malihe@pitt.edu
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code+data at github.com/vgtomahawk/NAREORCamReady.
Open Datasets | Yes | We set aside 600, 200, and 200 stories from the train, dev, and test splits of ROCStories. These act as NAREORC's train_Sup, dev_Sup, and test_Sup splits, for which we collect human references. Remaining stories in each ROCStories split are retained as train_Unsup, dev_Unsup, and test_Unsup of size 95161, 1671, and 1671.
Dataset Splits | Yes | Same evidence as above: 600/200/200 supervised stories set aside from the ROCStories train/dev/test splits (train_Sup, dev_Sup, test_Sup, with human references), with the remaining 95161/1671/1671 stories retained as train_Unsup, dev_Unsup, and test_Unsup. (A split sketch follows the table.)
Hardware Specification | Yes | We used 16GB V100 GPUs for fine-tuning.
Software Dependencies | No | We use Hugging Face's implementations of their base versions. We fine-tuned the base models from Hugging Face.
Experiment Setup | Yes | We used a learning rate of 1e-5. Models were trained for 10 epochs with a batch size of 8. The initial warm-up steps are 1000. We also used gradient accumulation steps of 4. For inference, we generate output up to length 200. (A fine-tuning sketch using these values follows the table.)
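
The split scheme quoted in the Open Datasets and Dataset Splits rows is concrete enough to sketch. Below is a minimal, hypothetical Python reconstruction; the quoted text does not say how the 600/200/200 supervised stories were chosen, so the random selection and seed are assumptions, and the authoritative split files live in the linked repository.

```python
# Hypothetical reconstruction of the NAREORC split scheme: set aside
# 600/200/200 stories from the ROCStories train/dev/test splits as the
# supervised (human-annotated) portions; everything left over becomes
# the unsupervised portion. The random selection and seed are assumptions;
# the quoted text only gives the split sizes.
import random

def make_nareorc_splits(rocstories, seed=0):
    """rocstories: dict mapping 'train'/'dev'/'test' to lists of stories."""
    rng = random.Random(seed)
    n_sup = {"train": 600, "dev": 200, "test": 200}
    splits = {}
    for name, stories in rocstories.items():
        stories = list(stories)  # copy so shuffling is non-destructive
        rng.shuffle(stories)
        splits[f"{name}_Sup"] = stories[:n_sup[name]]
        splits[f"{name}_Unsup"] = stories[n_sup[name]:]
    return splits

# On the full ROCStories corpus, the unsupervised splits should come out
# to 95161 / 1671 / 1671 stories, matching the sizes quoted above.
```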
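
The Experiment Setup row pins down enough hyperparameters for a fine-tuning sketch. The following is a minimal, hypothetical Hugging Face transformers setup using those values; only the hyperparameters come from the paper, while the checkpoint choice (facebook/bart-base), the output path, and the one-pair toy dataset are illustrative assumptions.

```python
# Minimal fine-tuning sketch using the hyperparameters quoted above
# (lr 1e-5, 10 epochs, batch size 8, 1000 warm-up steps, gradient
# accumulation of 4, generation length up to 200). Everything else,
# including the checkpoint and the toy training pair, is assumed.
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

model_name = "facebook/bart-base"  # paper fine-tunes base BART and T5
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Toy stand-in for a NAREOR training pair (reordered story -> rewrite);
# the real data comes from the NAREORC splits.
src = "2. He trained hard. 1. Ken entered a marathon."
tgt = "Ken trained hard for the marathon he had entered."
example = tokenizer(src, truncation=True)
example["labels"] = tokenizer(tgt, truncation=True)["input_ids"]

args = Seq2SeqTrainingArguments(
    output_dir="nareor-bart-base",  # hypothetical output path
    learning_rate=1e-5,
    num_train_epochs=10,
    per_device_train_batch_size=8,
    warmup_steps=1000,
    gradient_accumulation_steps=4,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=[example],  # toy one-example dataset
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()

# Inference: generate output up to length 200, as reported.
inputs = tokenizer(src, return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_length=200)[0],
                       skip_special_tokens=True))
```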