Unsupervised Editing for Counterfactual Stories
Authors: Jiangjie Chen, Chun Gan, Sijie Cheng, Hao Zhou, Yanghua Xiao, Lei Li | pp. 10473-10481
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate EDUCAT on a public counterfactual story rewriting benchmark. Experiments show that EDUCAT achieves the best trade-off over unsupervised SOTA methods according to both automatic and human evaluation. |
| Researcher Affiliation | Collaboration | Jiangjie Chen (1,2)*, Chun Gan (3)*, Sijie Cheng (1), Hao Zhou (2), Yanghua Xiao (1,5), Lei Li (4) — 1: Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University; 2: ByteDance AI Lab; 3: JD.com; 4: University of California, Santa Barbara; 5: Fudan-Aishu Cognitive Intelligence Joint Research Center |
| Pseudocode | No | The paper describes the Metropolis-Hastings sampling algorithm and provides mathematical formulas but does not include a distinct pseudocode block or algorithm box. |
| Open Source Code | Yes | The resources of EDUCAT are available at: https://github.com/jiangjiechen/EDUCAT. |
| Open Datasets | Yes | We experiment with EDUCAT on TIMETRAVEL (Qin et al. 2019), a standard counterfactual story rewriting dataset. TIMETRAVEL is built on ROCStories (Mostafazadeh et al. 2016). |
| Dataset Splits | Yes | Table 1 (statistics of the TIMETRAVEL dataset): counterfactual contexts (x′): Train 96,867 / Dev 1,871 / Test 1,871; edited endings (y′): Train 16,752 / Dev 5,613 / Test 7,484. |
| Hardware Specification | No | The paper states it uses pre-trained models like GPT-2 and RoBERTa, but it does not provide specific details about the hardware (e.g., CPU, GPU models, or memory) used for running the experiments or for inference. |
| Software Dependencies | No | The paper mentions using 'implementations of Huggingface (Wolf et al. 2020)' and models like 'GPT-2' and 'RoBERTabase', but it does not specify version numbers for these software components or other ancillary libraries. |
| Experiment Setup | Yes | T is a temperature controlled by a cooling schedule (Andrieu et al. 2003); T = 0.95^(t/5) in our implementation. We keep the first 100 tokens the MLM predicts as candidates. In the experiments, we run EDUCAT and its variants for 100 steps. |
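The setup rows above (Metropolis-Hastings sampling, a geometric cooling schedule, 100 editing steps) can be sketched as a generic annealed accept/reject loop. This is a minimal illustration, not the paper's implementation: the `score` and `propose` functions are placeholders for EDUCAT's fluency/coherence scorer and MLM-based edit proposals, and reading the reported schedule as T = 0.95 ** (t / 5) is an assumption.

```python
import math
import random


def temperature(t, base=0.95, interval=5):
    # Assumed reading of the reported cooling schedule: T = 0.95 ** (t / 5),
    # i.e. the temperature decays geometrically as editing proceeds.
    return base ** (t / interval)


def metropolis_hastings_edit(score, propose, y0, steps=100, seed=0):
    """Generic annealed Metropolis-Hastings editing loop.

    At each step, propose a local edit y' of the current draft y and accept
    it with probability min(1, (score(y') / score(y)) ** (1 / T)), where T
    follows the cooling schedule above. Lower T makes acceptance greedier.
    """
    random.seed(seed)
    y = y0
    for t in range(1, steps + 1):
        y_new = propose(y)
        T = temperature(t)
        ratio = (score(y_new) / score(y)) ** (1.0 / T)
        if random.random() < min(1.0, ratio):
            y = y_new
    return y
```

As a toy usage, maximizing `score(y) = exp(-|y - 7|)` over integers with `propose(y) = y ± 1` drives the chain toward 7 as the temperature cools; in EDUCAT the state would instead be the story ending and the proposals token-level replace/insert/delete edits.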