Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Integrated Common Sense Learning and Planning in POMDPs

Authors: Brendan Juba

JMLR 2016

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We formulate a new variant of the problem of planning in an unknown environment, for which we can provide algorithms with reasonable theoretical guarantees in spite of large state spaces and time horizons, partial observability, and complex dynamics. In this variant, an agent is given a collection of example traces produced by a reference policy, which may, for example, capture the agent's past behavior. We describe an efficient algorithm that uses such common sense knowledge reflected in the example traces to construct decision tree policies for goal-oriented factored POMDPs. More precisely, our algorithm (provably) succeeds at finding a policy for a given input goal when (1) there is a CNF that is almost always observed satisfied on the traces of the POMDP, capturing a sufficient approximation of its dynamics, and (2) for a decision tree policy of bounded complexity, there exist small-space resolution proofs that the goal is achieved on each branch using the aforementioned CNF capturing the common sense rules.
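Condition (1) in the abstract, that a CNF be almost always observed satisfied on the example traces, can be illustrated with a minimal sketch. All names and the data layout (literals as signed integers, a trace as a set of observed literals) are assumptions for illustration, not from the paper:

```python
from typing import FrozenSet, List, Set

# A clause is a set of literals: a positive int means the variable is true,
# a negative int means it is false. A CNF is a list of clauses.
Clause = FrozenSet[int]

def clause_satisfied(clause: Clause, assignment: Set[int]) -> bool:
    """A clause holds if at least one of its literals is observed in the assignment."""
    return any(lit in assignment for lit in clause)

def cnf_validity(cnf: List[Clause], traces: List[Set[int]]) -> float:
    """Fraction of traces on which every clause of the CNF is observed satisfied."""
    sat = sum(all(clause_satisfied(c, t) for c in cnf) for t in traces)
    return sat / len(traces)

# Toy example: CNF (x1 OR NOT x2) AND (x3), checked against three traces.
cnf = [frozenset({1, -2}), frozenset({3})]
traces = [{1, 3}, {-1, -2, 3}, {2, -3}]
print(cnf_validity(cnf, traces))  # 2 of 3 traces satisfy the CNF
```

A CNF is "almost always observed satisfied" in this sense when `cnf_validity` is close to 1 over the reference policy's traces.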
Researcher Affiliation | Academia | Brendan Juba, EMAIL, Washington University, 1 Brookings Dr., St. Louis, MO 63130, USA
Pseudocode | Yes | Algorithm 1 Find-DT(). PAC-refute(ϕ, R) decides if ϕ ∧ ψ is at most ϵ-valid given some testable ψ over weighted partial examples in R (taken out of m) (cf. Theorem 8).
Input: DNF goal G, CNF KB Φ, Strahler number s, policy branch Π taking t−1 actions, horizon bound T, sample of traces with stepwise weights R
Output: A Strahler-s decision tree or ⊥
if Π takes more than T actions then
    return ⊥
end if
R_t ← {(∧_{i=1}^{t−1}(o(i), a(i)) ∧ o(t), Π_{i=1}^{t−1} w_i) : ((o(i), a(i), w_i)_{i=1}^{T}, o(T+1)) ∈ R, Π_{i=1}^{t−1} w_i ≥ 4µ}
if PAC-Refute(¬G(t) ∧ Φ ∧ "Π taken", R_t) then
    return Π
end if
if s > 1 then
    for all ℓ s.t. ℓ(t) and ¬ℓ(t) not in Π do
        Π_ℓ ← Π with a final test for ℓ(t)
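The control flow of the excerpted pseudocode can be sketched as a recursive Python skeleton. This is a rough illustration only: the PAC-Refute subroutine is stubbed as a caller-supplied predicate, the trace filtering is omitted, the Strahler-number bookkeeping is simplified to a plain depth budget, and every name here is an assumption, not the paper's implementation:

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical encoding: a branch is a list of (literal, step) tests; the return
# value is the certified branch, or None standing in for the pseudocode's ⊥.
Branch = List[Tuple[int, int]]

def find_dt(goal_refuted: Callable[[Branch, int], bool],
            literals: List[int],
            branch: Branch,
            s: int, t: int, T: int) -> Optional[Branch]:
    """Sketch of Find-DT's recursion: extend the branch with literal tests until
    the PAC-Refute stub certifies the goal, within depth budget s and horizon T."""
    if t > T:                        # Π takes more than T actions: fail this branch
        return None
    if goal_refuted(branch, t):      # stands in for PAC-Refute(¬G(t) ∧ Φ ∧ "Π taken", R_t)
        return branch
    if s > 1:
        for lit in literals:
            # skip literals already tested on this branch, in either polarity
            if any(l in (lit, -lit) for l, _ in branch):
                continue
            ext = find_dt(goal_refuted, literals,
                          branch + [(lit, t)], s - 1, t + 1, T)
            if ext is not None:
                return ext
    return None
```

For example, with a stub predicate that declares the goal refuted once literal 2 has been tested, `find_dt(lambda b, t: any(l == 2 for l, _ in b), [1, 2], [], 2, 1, 3)` returns the single-test branch `[(2, 1)]`.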
Open Source Code | No | The paper does not provide concrete access to source code. It discusses algorithms and theoretical findings but does not mention any repository links, explicit code release statements, or code in supplementary materials for its methodology.
Open Datasets | No | The paper uses an example based on the 'Gripper problem from the first International Planning Competition (IPC 1998)' to illustrate concepts, not as an experimental dataset, and provides no concrete access information (link, DOI, repository, or citation) for any publicly available or open dataset used in empirical evaluation.
Dataset Splits | No | The paper reports no empirical experiments with datasets, so no training/validation/test splits are described.
Hardware Specification | No | The paper focuses on theoretical algorithms and their complexity analysis. It does not describe any specific hardware (such as GPU models, CPU types, or cloud resources) used for running experiments.
Software Dependencies | No | The paper is theoretical and focuses on algorithm design and proofs, not on implementation details. It does not mention any specific software dependencies with version numbers (e.g., libraries, frameworks, or solvers) needed for experimental replication.
Experiment Setup | No | The paper is theoretical and describes algorithms and proofs. It does not include details about an experimental setup, such as hyperparameters, training configurations, or system-level settings, as no empirical experiments are presented.