Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Qualitative Reasoning about 2D Cardinal Directions using Answer Set Programming

Authors: Yusuf Izmirlioglu, Esra Erdem

JAIR 2023 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We design and develop a variety of benchmark instances, and comprehensively evaluate NCDC-ASP from the perspectives of computational efficiency. We perform a comprehensive set of experiments, present the results with figures and tables, and discuss the experimental results (Section 13).
Researcher Affiliation	Academia	Yusuf Izmirlioglu EMAIL New Mexico State University Department of Computer Science Las Cruces NM 88003 USA Esra Erdem EMAIL Sabanci University Faculty of Engineering and Natural Sciences 34956 Istanbul, Turkey
Pseudocode	Yes	In this paper, we present ASP programs in mathematical format instead of the input language of an ASP solver. Therefore, for terms, the lower-case letters denote schematic variables while the upper-case letters denote object constants. For our experiments, we implement these programs in the language of CLINGO, confirming with the ASP-Core-2 standard (Calimeri et al., 2013). We provide these ASP programs in Appendix B.
Open Source Code	Yes	The ASP code, benchmark problem instances and the example scenarios can be found in the online repository https://github.com/yizmirlioglu/nCDC. We have also created a software which takes input from the user and performs the automatic computation of the reasoning tasks using these ASP programs. This software is available at another repository https://github.com/yizmirlioglu/nCDC-ASP-Software.
Open Datasets	Yes	We introduce a comprehensive set of benchmarks for experimental evaluations (Section 12). Some of these benchmarks are carefully handcrafted, avoiding too many redundant CDC constraints, to better analyze the scalability of the ASP-based method for consistency checking, the effect of the degree of incompleteness of the CDC constraints, and the effect of including different types of constraints. Some of these benchmarks are randomly generated. ... The ASP code, benchmark problem instances and the example scenarios can be found in the online repository https://github.com/yizmirlioglu/nCDC.
Dataset Splits	No	The paper discusses generating different types of benchmark instances (e.g., Sparse, Medium, Dense, Complete networks with varying numbers of objects) and random samples (e.g., "50 samples over Reg* and Reg"). However, it does not specify explicit training, validation, or test dataset splits in the context of machine learning models. The term 'splits' is not used in the context of data partitioning for model training or evaluation.
Hardware Specification	Yes	All tests have been performed on a Linux server with 3.3GHz Intel Xeon W-2155 CPU, 32GB memory, single thread and using the ASP solver CLINGO 5.3.0.
Software Dependencies	Yes	All tests have been performed on a Linux server with 3.3GHz Intel Xeon W-2155 CPU, 32GB memory, single thread and using the ASP solver CLINGO 5.3.0. For our experiments, we implement these programs in the language of CLINGO, confirming with the ASP-Core-2 standard (Calimeri et al., 2013).
Experiment Setup	Yes	For each instance, the grid size is precomputed as suggested by Theorem 9. The total computation time reported in the figures includes the time to calculate the grid size, although this is negligible compared to the timings for consistency checking. In the figures, the height of a bar in a plot denotes the total computation time (CPU time in seconds). Each bar is splitted by a vertical line; the lower part of the bar (darker color) shows the grounding time, whereas the top part (lighter color) shows the search time.
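
The grounding/search breakdown described in the Experiment Setup row can be approximated from the summary that CLINGO prints at the end of a run. The following is a minimal sketch, not the authors' measurement script: the helper name `split_times` and the sample output are hypothetical, and it assumes that all time not reported as "Solving" counts as grounding, which also folds in parsing and preprocessing. It reads the wall-clock "Time" line, whereas the paper reports CPU time.

```python
import re

# Matches the summary line printed by clingo, for example:
# "Time         : 0.105s (Solving: 0.02s 1st Model: 0.01s Unsat: 0.00s)"
# Anchoring at line start keeps the separate "CPU Time" line from matching.
TIME_LINE = re.compile(
    r"^Time\s*:\s*([\d.]+)s\s*\(Solving:\s*([\d.]+)s", re.MULTILINE
)


def split_times(clingo_output: str) -> dict:
    """Split clingo's reported total time into grounding and search parts."""
    match = TIME_LINE.search(clingo_output)
    if match is None:
        raise ValueError("no 'Time' summary line found in clingo output")
    total = float(match.group(1))
    search = float(match.group(2))
    # Assumption: whatever is not solving is attributed to grounding.
    grounding = max(total - search, 0.0)
    return {"total": total, "grounding": grounding, "search": search}


# Illustrative output only; these numbers are not taken from the paper.
SAMPLE = """\
Models       : 1
Calls        : 1
Time         : 0.105s (Solving: 0.02s 1st Model: 0.01s Unsat: 0.00s)
CPU Time     : 0.100s
"""

if __name__ == "__main__":
    print(split_times(SAMPLE))
```

In a stacked bar like the ones the paper describes, `grounding` would be the lower segment and `search` the upper one.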