Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Learning Scripts as Hidden Markov Models
Authors: John Orr, Prasad Tadepalli, Janardhan Doppa, Xiaoli Fern, Thomas Dietterich
AAAI 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We develop an algorithm for structure and parameter learning based on Expectation Maximization and evaluate it on a number of natural datasets. The results show that our algorithm is superior to several informed baselines for predicting missing events in partial observation sequences. |
| Researcher Affiliation | Academia | J. Walker Orr, Prasad Tadepalli, Janardhan Rao Doppa, Xiaoli Fern, Thomas G. Dietterich EMAIL School of EECS, Oregon State University, Corvallis OR 97331 |
| Pseudocode | Yes | Algorithm 1 procedure LEARN(Model M, Data D, Changes S); Algorithm 2 Forward-Backward algorithm to delete an edge and re-distribute the expected counts. |
| Open Source Code | No | The paper does not provide any specific links or statements indicating that its source code is publicly available. |
| Open Datasets | Yes | The Open Minds Indoor Common Sense (OMICS) corpus was developed by the Honda Research Institute and is based upon the Open Mind Common Sense project (Gupta and Kochenderfer 2004). |
| Dataset Splits | No | The paper mentions a test split ("forty percent of the narratives were withheld for testing") but does not specify a separate validation split for hyperparameter tuning or model selection. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using WordNet for semantic similarity but does not name any software with version numbers, which reproducibility would require. |
| Experiment Setup | Yes | The paper mentions specific experimental parameters such as "Batch Size r 2 5 10" in Table 1, indicating that `r` is a varying hyperparameter. It also mentions "adding a pseudocount of 1" and a z-test "threshold of 0.01" for constraint learning. |
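
To make two of the setup details quoted in the table concrete, here is a minimal Python sketch of withholding forty percent of the narratives for testing and of adding a pseudocount of 1 before normalizing counts. It is an illustration under assumptions, not the authors' code: the function names `split_narratives` and `smoothed_probs`, the fixed seed, the shuffling strategy, and the toy event counts are all hypothetical, and this summary does not say which counts the paper smooths.

```python
import random


def split_narratives(narratives, test_fraction=0.4, seed=0):
    """Shuffle and withhold `test_fraction` of the narratives for testing.

    The seed and the shuffle-then-slice strategy are assumptions made
    for illustration; the summary only states the forty percent figure.
    """
    shuffled = list(narratives)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def smoothed_probs(counts, pseudocount=1.0):
    """Add a pseudocount to every entry, then normalize to probabilities."""
    total = sum(counts.values()) + pseudocount * len(counts)
    return {key: (value + pseudocount) / total for key, value in counts.items()}


# Toy data (hypothetical): ten narratives and three event counts.
narratives = [f"narrative_{i}" for i in range(10)]
train, test = split_narratives(narratives)
probs = smoothed_probs({"open_door": 3, "enter_room": 1, "sit_down": 0})
```

With the pseudocount, an event that never appears in the training counts (`sit_down` above) still receives a nonzero probability, which is the usual reason for this kind of smoothing.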