Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Decision-Theoretic Planning Under Anonymity in Agent Populations
Authors: Ekhlas Sonu, Yingke Chen, Prashant Doshi
JAIR 2017 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The third contribution of this article is a comprehensive empirical evaluation of the methods on three new problem domains: policing large protests, controlling traffic congestion at a busy intersection, and improving the AI for the popular Clash of Clans multiplayer game. We demonstrate the feasibility of exact self-interested planning in these large problems, and that our methods for speeding up the planning are effective. |
| Researcher Affiliation | Academia | Ekhlas Sonu EMAIL Dept of Aeronautics and Astronautics Stanford University Stanford, CA 94305 USA; Yingke Chen EMAIL College of Computer Science Sichuan University Sichuan, China; Prashant Doshi EMAIL THINC Lab, Dept of Computer Science University of Georgia Athens, GA 30602 USA |
| Pseudocode | Yes | Algorithm 1: Computing Pr(Cν(·) \| b0,l(M1,l−1 \| s), …, b0,l(MN,l−1 \| s)); Algorithm 2: Initialize Node; Algorithm 3: Update Bounds; Algorithm 4: Branch; Algorithm 5: Bound |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code for the methodology described, nor does it provide a link to a code repository. It mentions third-party software like 'Clash of Clans' but not its own implementation code. |
| Open Datasets | No | The paper describes three problem domains (policing protests, traffic congestion control, multiplayer video gaming/Clash of Clans) used for empirical evaluation. These are described as simulated environments or a game, rather than external, publicly available datasets with specific access information (links, DOIs, formal citations). |
| Dataset Splits | No | The paper uses simulated problem domains for its experiments, rather than specific datasets. Therefore, there are no dataset splits (e.g., train/test/validation) described. |
| Hardware Specification | Yes | All computations are carried out on a RHEL platform with 2.80 GHz processor and 4 GB of main memory. |
| Software Dependencies | No | The paper mentions a 'RHEL platform' but does not specify versions for any programming languages, libraries, frameworks, or specialized solvers used for implementation. |
| Experiment Setup | Yes | We set the maximum planning horizon to 5 in all the experiments. The transition, observation and reward functions of the many-agent I-POMDP are all compactly encoded as frame-action hypergraphs; example hypergraphs are shown in Fig. 6. Other agents are modeled as POMDPs and their predicted behavior is obtained using bounded policy iteration (Poupart & Boutilier, 2003). In our second set of experiments, we evaluate on settings involving many more agents. As Table 1 indicates, the traditional I-POMDP does not realistically scale to N > 5 agents. |
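The pseudocode the paper provides (Initialize Node, Update Bounds, Branch, Bound) follows the standard branch-and-bound pattern. As a point of reference only, here is a minimal generic sketch of that pattern on a toy 0/1 knapsack problem; all names are hypothetical, and the paper's actual algorithms operate on I-POMDP belief nodes, not this toy task.

```python
# Generic branch-and-bound sketch: explore a binary decision tree, keep the
# best complete solution found so far (the incumbent), and prune any node
# whose optimistic upper bound cannot beat the incumbent.

def branch_and_bound(values, weights, capacity):
    n = len(values)
    # Sort items by value density so the fractional relaxation below is a
    # valid optimistic upper bound.
    order = sorted(range(n), key=lambda j: -values[j] / weights[j])
    values = [values[j] for j in order]
    weights = [weights[j] for j in order]
    best = [0]  # incumbent: best total value of any complete solution

    def upper_bound(i, value, remaining):
        # Optimistic bound: pack remaining items fractionally.
        bound = value
        for j in range(i, n):
            if weights[j] <= remaining:
                remaining -= weights[j]
                bound += values[j]
            else:
                bound += values[j] * remaining / weights[j]
                break
        return bound

    def branch(i, value, remaining):
        if value > best[0]:
            best[0] = value  # update the incumbent ("Update Bounds")
        if i == n or upper_bound(i, value, remaining) <= best[0]:
            return           # prune this subtree ("Bound")
        if weights[i] <= remaining:
            # branch on taking item i ("Branch")
            branch(i + 1, value + values[i], remaining - weights[i])
        # branch on skipping item i
        branch(i + 1, value, remaining)

    branch(0, 0, capacity)
    return best[0]
```

For example, `branch_and_bound([60, 100, 120], [10, 20, 30], 50)` returns `220`, pruning the subtree rooted at "skip the two densest items" because its fractional bound (120) cannot beat the incumbent.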