NaRuto: Automatically Acquiring Planning Models from Narrative Texts
Authors: Ruiqi Li, Leyang Cui, Songtuan Lin, Patrik Haslum
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present an evaluation of planning domain models derived from narrative texts using our fully automated, unsupervised system, NaRuto. Evaluation results show that NaRuto generates domain models of significantly better quality than existing fully automated methods, and sometimes even on par with those created by semi-automated methods with human assistance. |
| Researcher Affiliation | Collaboration | Ruiqi Li¹, Leyang Cui², Songtuan Lin¹, Patrik Haslum¹ (¹Australian National University, ²Tencent AI Lab); {ruiqi.li,songtuan.lin,patrik.haslum}@anu.edu.au, leyangcui@tencent.com |
| Pseudocode | No | The paper describes the approach using textual explanations and a flow diagram (Figure 2), but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code of the NaRuto system is at https://github.com/RichieLee93/NaRuto. |
| Open Datasets | Yes | For input, we use two short stories that have appeared in work on narrative planning: the Aladdin story by Riedl and Young (2010), and the Old American West story by Ware (2014). |
| Dataset Splits | No | The paper uses the Aladdin story by Riedl and Young (2010) and the Old American West story by Ware (2014) for evaluation, but does not specify explicit train/test/validation splits for these narrative texts in the context of their experiments. While their COMET-BM model is fine-tuned on ATOMIC-2020, specific splits for its training are not provided. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, or cloud configurations) used for running the experiments or training the models. |
| Software Dependencies | No | The paper mentions several software components and systems, such as AllenNLP, Stanford CoreNLP, GPT, BART, COMET, and Fast Downward, but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | Based on the probability distributions of each relation's predictions, we set K = 6, θ_xNeed = 0.7, θ_xEffect = θ_oEffect = 0.5, and θ_xReact = θ_oReact = 0.2. (A hedged configuration sketch of these settings follows the table.) |
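The experiment-setup row above amounts to a per-relation confidence cutoff applied to COMET's top-K inferences over ATOMIC-2020 relations. Below is a minimal sketch of how such filtering could be wired up. The `comet_predict` stub, its return format, and the function names are assumptions made for illustration; the authors' actual implementation lives in the repository linked above.

```python
from typing import Dict, List, Tuple

# Values reported in the paper: request K candidate inferences per relation,
# then keep an inference only if its probability clears that relation's cutoff.
K = 6
THETA: Dict[str, float] = {
    "xNeed": 0.7,
    "xEffect": 0.5,
    "oEffect": 0.5,
    "xReact": 0.2,
    "oReact": 0.2,
}

def comet_predict(event: str, relation: str, k: int) -> List[Tuple[str, float]]:
    """Hypothetical stub standing in for a COMET inference call.

    Assumed to return up to k (inference_text, probability) pairs for the
    given event and ATOMIC-2020 relation, sorted by descending probability.
    """
    raise NotImplementedError("replace with a real COMET inference call")

def filter_inferences(event: str) -> Dict[str, List[str]]:
    """Keep only the inferences whose probability meets the per-relation cutoff."""
    kept: Dict[str, List[str]] = {}
    for relation, cutoff in THETA.items():
        candidates = comet_predict(event, relation, K)
        kept[relation] = [text for text, prob in candidates if prob >= cutoff]
    return kept
```

Note the design the thresholds imply: preconditions (xNeed) are filtered strictly, effects moderately, and emotional reactions (xReact/oReact) permissively, reflecting how confident COMET's probability distributions tend to be for each relation type.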