Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
High-Fidelity Simulated Players for Interactive Narrative Planning
Authors: Pengcheng Wang, Jonathan Rowe, Wookhee Min, Bradford Mott, James Lester
IJCAI 2018 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results demonstrate that the proposed models significantly outperform the prior state-of-the-art in generating high-fidelity simulated player models that accurately imitate human players' narrative interactions. |
| Researcher Affiliation | Academia | Pengcheng Wang, Jonathan Rowe, Wookhee Min, Bradford Mott, James Lester Department of Computer Science, North Carolina State University, Raleigh, NC 27695, USA EMAIL |
| Pseudocode | No | The paper includes architectural diagrams (Figure 1) and mathematical equations, but no pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link regarding the availability of its source code. |
| Open Datasets | No | The paper describes the "CRYSTAL ISLAND" dataset, stating "The interactive narrative dataset was collected from two human subject studies with 453 players." However, it does not provide any access information (link, DOI, or specific citation for public availability) for this dataset. |
| Dataset Splits | Yes | Five-fold cross-validation is conducted for both prediction accuracy and macro-average F1 score based evaluations. The number of training epochs in each round of cross-validation is determined using a separate validation set, which is later merged back into the training set for the final evaluation. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions "Adam [Kingma and Ba, 2015] is employed for NN optimization." but does not specify version numbers for Adam, any deep learning frameworks (like TensorFlow or PyTorch), or other libraries. |
| Experiment Setup | Yes | A dropout rate of 0.1 is adopted for all the models. The number of training epochs in each round of cross-validation is determined using a separate validation set. All the model-size-related hyperparameters are tuned using random search. All evaluations are conducted by allowing the derived narrative planners to interact with high-fidelity simulated player models for 5,000 episodes. The ε value in the ε-greedy exploration of AQN is set to decay linearly from 1 to 0.01 over the first 75% of training steps. |
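The evaluation protocol reported under "Dataset Splits" (five-fold cross-validation, with a validation subset carved out of each round's training data to pick the epoch count and then merged back for the final fit) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; `pick_epochs` and `train_eval` are hypothetical callbacks standing in for the paper's model training and evaluation.

```python
import random

def five_fold_indices(n, seed=0):
    """Shuffle sample indices and split them into 5 roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::5] for i in range(5)]

def cross_validate(n, pick_epochs, train_eval):
    """For each fold: hold it out as the test set, carve a validation
    subset from the remaining data to choose the number of training
    epochs, then retrain on the merged train+validation data and
    evaluate on the held-out fold. Returns the mean fold score."""
    folds = five_fold_indices(n)
    scores = []
    for k in range(5):
        test = folds[k]
        rest = [i for j, f in enumerate(folds) if j != k for i in f]
        val = rest[: len(rest) // 5]      # validation subset for epoch tuning
        train = rest[len(rest) // 5 :]
        epochs = pick_epochs(train, val)  # early-stopping-style epoch choice
        # Final fit uses train + validation merged back together.
        scores.append(train_eval(rest, test, epochs))
    return sum(scores) / len(scores)
```

The key detail mirrored here is that the validation set is only used to select the epoch count and is then returned to the training pool before the final evaluation on the test fold.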
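The ε-greedy schedule quoted above (linear decay from 1 to 0.01 over the first 75% of training steps, then held constant) can be written as a small helper. A minimal sketch, assuming the decay is computed per training step; the function name and defaults are illustrative, not from the paper.

```python
def epsilon_at(step, total_steps, eps_start=1.0, eps_end=0.01, decay_fraction=0.75):
    """Linear epsilon schedule: decay from eps_start to eps_end over the
    first decay_fraction of training, then hold at eps_end."""
    decay_steps = int(total_steps * decay_fraction)
    if step >= decay_steps:
        return eps_end
    # Linear interpolation between the start and end exploration rates.
    return eps_start + (eps_end - eps_start) * (step / decay_steps)
```

With this schedule, exploration starts fully random (ε = 1), reaches ε = 0.01 at 75% of training, and stays there for the remaining steps.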