NaRuto: Automatically Acquiring Planning Models from Narrative Texts
Authors: Ruiqi Li, Leyang Cui, Songtuan Lin, Patrik Haslum
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present an evaluation of planning domain models derived from narrative texts using our fully automated, unsupervised system, NaRuto. Evaluation results show that NaRuto generates domain models of significantly better quality than existing fully automated methods, and sometimes even on par with those created by semi-automated methods with human assistance. |
| Researcher Affiliation | Collaboration | Ruiqi Li¹, Leyang Cui², Songtuan Lin¹, Patrik Haslum¹ (¹Australian National University, ²Tencent AI Lab); {ruiqi.li,songtuan.lin,patrik.haslum}@anu.edu.au, leyangcui@tencent.com |
| Pseudocode | No | The paper describes the approach using textual explanations and a flow diagram (Figure 2), but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code of the NaRuto system is at https://github.com/RichieLee93/NaRuto. |
| Open Datasets | Yes | For input, we use two short stories that have appeared in work on narrative planning: the Aladdin story by Riedl and Young (2010), and the Old American West story by Ware (2014). |
| Dataset Splits | No | The paper uses the Aladdin story by Riedl and Young (2010) and the Old American West story by Ware (2014) for evaluation, but does not specify explicit train/test/validation splits for these narrative texts in the context of their experiments. While their COMET-BM model is fine-tuned on ATOMIC-2020, specific splits for its training are not provided. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, or cloud configurations) used for running the experiments or training the models. |
| Software Dependencies | No | The paper mentions several software components and systems, such as AllenNLP, Stanford CoreNLP, GPT, BART, COMET, and Fast Downward, but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | Based on the probability distributions of each relation's predictions, we set K = 6, θ_xNeed = 0.7, θ_xEffect = θ_oEffect = 0.5, and θ_xReact = θ_oReact = 0.2. (A hedged configuration sketch of these settings follows the table.) |
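The experiment-setup row above amounts to a per-relation confidence cutoff applied to COMET's top-K inferences over ATOMIC-2020 relations. Below is a minimal sketch of how such filtering could be wired up. The `comet_predict` stub, its return format, and the function names are assumptions made for illustration; the authors' actual implementation lives in the repository linked above.

```python
from typing import Dict, List, Tuple

# Values reported in the paper: request K candidate inferences per relation,
# then keep an inference only if its probability clears that relation's cutoff.
K = 6
THETA: Dict[str, float] = {
    "xNeed": 0.7,
    "xEffect": 0.5,
    "oEffect": 0.5,
    "xReact": 0.2,
    "oReact": 0.2,
}

def comet_predict(event: str, relation: str, k: int) -> List[Tuple[str, float]]:
    """Hypothetical stub standing in for a COMET inference call.

    Assumed to return up to k (inference_text, probability) pairs for the
    given event and ATOMIC-2020 relation, sorted by descending probability.
    """
    raise NotImplementedError("replace with a real COMET inference call")

def filter_inferences(event: str) -> Dict[str, List[str]]:
    """Keep only the inferences whose probability meets the per-relation cutoff."""
    kept: Dict[str, List[str]] = {}
    for relation, cutoff in THETA.items():
        candidates = comet_predict(event, relation, K)
        kept[relation] = [text for text, prob in candidates if prob >= cutoff]
    return kept
```

Note the design the thresholds imply: preconditions (xNeed) are filtered strictly, effects moderately, and emotional reactions (xReact/oReact) permissively, reflecting how confident COMET's probability distributions tend to be for each relation type.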