Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Analyzing Intentional Behavior in Autonomous Agents under Uncertainty
Authors: Filip Cano Córdoba, Samuel Judson, Timos Antonopoulos, Katrine Bjørner, Nicholas Shoemaker, Scott J. Shapiro, Ruzica Piskac, Bettina Könighofer
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In a case study, we show how our method can distinguish between intentional and accidental traffic collisions. ... In this section, we showcase our method on a traffic-related scenario related to Examples 1-2, and that is illustrated in Figure 2. ... All experiments were executed on an Intel Core i5 CPU with 16GB of RAM running Ubuntu 20.04. We use TEMPEST [Pranger et al., 2021] as our model checking engine. ... Section headings: 6.1 Model of Environment; 6.2 Analysis of a Trace; 6.3 Comparative Analysis of Several Agents. |
| Researcher Affiliation | Academia | (1) Graz University of Technology, (2) Yale University, (3) New York University |
| Pseudocode | No | The paper describes the methodology in prose and includes a high-level flowchart (Figure 1), but it does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Find code and experimental details in the accompanying repository https://github.com/filipcano/intentional-autonomous-agents. |
| Open Datasets | No | The paper describes modeling a custom environment and scenario (Section 6.1) rather than using a pre-existing, publicly available dataset with concrete access information. No dataset is mentioned for public access. |
| Dataset Splits | No | The paper analyzes a specific scenario and generates counterfactual scenarios for analysis, but it does not describe a traditional machine learning experimental setup with training, validation, and test dataset splits with specified percentages or sample counts. |
| Hardware Specification | Yes | All experiments were executed on an Intel Core i5 CPU with 16GB of RAM running Ubuntu 20.04. |
| Software Dependencies | No | The paper mentions using "TEMPEST [Pranger et al., 2021] as our model checking engine" and "Ubuntu 20.04", but it does not provide specific version numbers for software libraries, frameworks, or solvers beyond the operating system. |
| Experiment Setup | Yes | As thresholds to evaluate evidence of intention, we use $\delta^L_\rho = 0.25$, $\delta^U_\rho = 0.75$ and $\delta_\sigma = 0.5$. ... We change the following variables: Slipperiness range... Slipperiness factor... Hesitancy factor... Visibility... The variables and the ranges considered for generating counterfactuals are summarized in Table 1. |
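
The Experiment Setup row quotes three threshold values (0.25, 0.75 and 0.5) but not the rule that applies them. The sketch below shows one way such thresholds could be used. It is a guess at the shape of the rule, not the paper's procedure. It assumes that ρ and σ are scores in [0, 1], that σ must reach its threshold before ρ is interpreted, and that ρ is then compared against a lower and an upper bound. The names `Thresholds` and `classify_evidence` and the returned labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Threshold values quoted in the Experiment Setup row."""
    rho_lower: float = 0.25  # lower threshold on rho
    rho_upper: float = 0.75  # upper threshold on rho
    sigma_min: float = 0.5   # threshold on sigma


def classify_evidence(rho: float, sigma: float,
                      t: Thresholds = Thresholds()) -> str:
    """Map a (rho, sigma) pair to a coarse label.

    ASSUMPTION: this three-way rule is illustrative only; the summary
    above gives the threshold values but not the decision procedure.
    """
    if sigma < t.sigma_min:
        # Assumed: too little to judge from, so no verdict either way.
        return "insufficient"
    if rho >= t.rho_upper:
        return "evidence of intention"
    if rho <= t.rho_lower:
        return "evidence against intention"
    return "inconclusive"


if __name__ == "__main__":
    for rho, sigma in [(0.9, 0.8), (0.1, 0.8), (0.5, 0.8), (0.9, 0.2)]:
        print(rho, sigma, classify_evidence(rho, sigma))
```

Keeping the thresholds in a frozen dataclass makes it easy to rerun the same comparison with other values, for example when varying the counterfactual ranges listed in Table 1 of the paper.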