Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Optimal Status Enforcement in Abstract Argumentation
Authors: Andreas Niskanen, Johannes P. Wallner, Matti Järvisalo
IJCAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We have implemented the MaxSAT encodings and the CEGAR-procedures, obtaining the first system for optimal status enforcement. Here we present an overview of an empirical evaluation of the system. We generated benchmark instances following essentially a standard model for random directed graphs. Mean runtimes with timeouts included as 900s are shown in Figure 2 for the NP problems of credulous status enforcement with \|N\| under admissible semantics (left) and for the Σ₂ᴾ skeptical and credulous status enforcement problems under stable semantics (right). |
| Researcher Affiliation | Academia | Andreas Niskanen and Johannes P. Wallner and Matti Järvisalo Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Finland |
| Pseudocode | Yes | Algorithm 1: CEGAR-based status enforcement for AF F = (A, R), P, N ⊆ A, σ ∈ {adm, stb}, M ∈ {cred, skept} |
| Open Source Code | Yes | Our status enforcement system implementation together with benchmarks used in this paper, as well as full formal proofs of our complexity results, are available via http://www.cs.helsinki.fi/group/coreo/pakota/. |
| Open Datasets | No | No specific publicly available dataset with concrete access information (link, DOI, formal citation with authors/year) is provided. The paper states: 'We generated benchmark instances following essentially a standard model for random directed graphs. For each \|A\| ∈ {20, 40, ..., 200} and p ∈ {0.05, 0.1, ..., 0.35}, we generated ten random AFs with \|A\| arguments by including individual attacks with probability p.' While their generated benchmarks are available via a general project URL mentioned earlier, this question refers to a dataset being publicly available with a formal citation/access, not the generated instances of their experiment. |
| Dataset Splits | No | No explicit details about training, validation, or test splits were provided. The paper describes how benchmark instances were generated: 'For each \|A\| ∈ {20, 40, ..., 200} and p ∈ {0.05, 0.1, ..., 0.35}, we generated ten random AFs with \|A\| arguments by including individual attacks with probability p. For each AF, we randomly picked 5 arguments, of which we enforced \|P\| ∈ {1, 2, ..., 5} positively, and finally picked \|N\| ∈ {0, 1, 2, 5} arguments from the set A \ P to be enforced negatively.' |
| Hardware Specification | Yes | We used Open-WBO [Martins et al., 2014] as the MaxSAT solver, and ran the experiments on 2.83-GHz Intel Xeon E5440 4-core nodes with 32-GB RAM and Debian GNU/Linux 8 under 900-second per-instance timeout. |
| Software Dependencies | No | The paper states: 'We used Open-WBO [Martins et al., 2014] as the MaxSAT solver, and ran the experiments on... Debian GNU/Linux 8'. While Open-WBO is named, its version number is not provided, and Debian GNU/Linux is an operating system, not an ancillary software dependency with a specific version number to ensure reproducibility of the experimental software stack. |
| Experiment Setup | Yes | For each \|A\| ∈ {20, 40, ..., 200} and p ∈ {0.05, 0.1, ..., 0.35}, we generated ten random AFs with \|A\| arguments by including individual attacks with probability p. For each AF, we randomly picked 5 arguments, of which we enforced \|P\| ∈ {1, 2, ..., 5} positively, and finally picked \|N\| ∈ {0, 1, 2, 5} arguments from the set A \ P to be enforced negatively. We used Open-WBO [Martins et al., 2014] as the MaxSAT solver, and ran the experiments on 2.83-GHz Intel Xeon E5440 4-core nodes with 32-GB RAM and Debian GNU/Linux 8 under 900-second per-instance timeout. |
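
The benchmark generation quoted in the Experiment Setup row can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' generator: the function names `random_af` and `pick_enforcement_sets` are hypothetical, and both the use of Python's `random` module and the decision to allow self-attacks are assumptions, since the paper only says it follows "essentially a standard model for random directed graphs".

```python
import random
from itertools import product


def random_af(num_args, p, rng):
    """Build a random argumentation framework (A, R).

    Each ordered pair (a, b) becomes an attack independently with
    probability p. Assumption: self-attacks (a, a) are allowed; the
    source does not say whether they were.
    """
    arguments = list(range(num_args))
    attacks = {(a, b) for a, b in product(arguments, repeat=2) if rng.random() < p}
    return arguments, attacks


def pick_enforcement_sets(arguments, num_pos, num_neg, rng):
    """Pick the positively (P) and negatively (N) enforced arguments.

    Five arguments are drawn at random and num_pos of them form P;
    N is then drawn from A \\ P, as described in the quoted setup.
    """
    pool = rng.sample(arguments, 5)
    positive = set(pool[:num_pos])
    remaining = [a for a in arguments if a not in positive]
    negative = set(rng.sample(remaining, num_neg))
    return positive, negative


if __name__ == "__main__":
    rng = random.Random(2016)  # fixed seed is an assumption, for repeatability
    sizes = range(20, 201, 20)                          # |A| in {20, 40, ..., 200}
    probs = [round(0.05 * k, 2) for k in range(1, 8)]   # p in {0.05, ..., 0.35}
    for n in sizes:
        for p in probs:
            for _ in range(10):  # ten random AFs per (|A|, p) pair
                args, attacks = random_af(n, p, rng)
                pos, neg = pick_enforcement_sets(args, num_pos=3, num_neg=2, rng=rng)
```

The demo loop fixes |P| = 3 and |N| = 2 for brevity; the setup described above varies |P| over {1, ..., 5} and |N| over {0, 1, 2, 5} for each AF.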