Stick-Breaking Policy Learning in Dec-POMDPs

Authors: Miao Liu, Christopher Amato, Xuejun Liao, Lawrence Carin, Jonathan P. How

IJCAI 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The performance of Dec-SBPR is demonstrated on several benchmark problems, showing that the algorithm scales to large problems while outperforming other state-of-the-art methods.
Researcher Affiliation | Academia | Miao Liu (MIT, Cambridge, MA, miaoliu@mit.edu); Christopher Amato (University of New Hampshire, Durham, NH, camato@cs.unh.edu); Xuejun Liao and Lawrence Carin (Duke University, Durham, NC, {xjliao,lcarin}@duke.edu); Jonathan P. How (MIT, Cambridge, MA, jhow@mit.edu)
Pseudocode | Yes | Algorithm 1: Batch VB Inference for Dec-SBPR
Open Source Code | No | The paper does not provide any explicit statement or link to open-source code for the described methodology.
Open Datasets | Yes | Downloaded from http://rbr.cs.umass.edu/camato/decpomdp/download.html
Dataset Splits | No | The paper mentions using 'K = 300 episodes' for learning and '100 test episodes' for evaluation, but it does not specify explicit train/validation/test splits by percentages or counts, nor does it explicitly mention a validation set.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9).
Experiment Setup | Yes | For Dec-SBPR, the hyperparameters in (8) are set to c = 0.1 and d = 10⁻⁶ to promote sparse usage of FSC nodes. The policies are initialized as FSCs converted from the episodes with the highest rewards, using a method similar to [Amato and Zilberstein, 2009].
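For context on the experiment-setup row above: Dec-SBPR places a stick-breaking prior over finite-state-controller (FSC) nodes, with a Gamma hyperprior whose hyperparameters c and d take the values quoted in the table. The following minimal NumPy sketch (not the authors' code) illustrates how a truncated stick-breaking construction with a small concentration parameter concentrates probability mass on a few controller nodes; the shape/scale parameterization of the Gamma draw, the truncation level of 50 nodes, and the 1e-3 threshold are assumptions made here for illustration, not details taken from the paper.

```python
import numpy as np

def stick_breaking_weights(alpha, max_nodes=50, rng=None):
    """Truncated stick-breaking weights: pi_k = V_k * prod_{j<k}(1 - V_j), V_k ~ Beta(1, alpha)."""
    rng = np.random.default_rng() if rng is None else rng
    betas = rng.beta(1.0, alpha, size=max_nodes)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    return betas * remaining

rng = np.random.default_rng(seed=0)
c, d = 0.1, 1e-6  # hyperparameter values quoted in the table
# Assumption: shape/scale parameterization of the Gamma hyperprior,
# which yields a very small concentration parameter (guarded away from 0).
alpha = max(rng.gamma(shape=c, scale=d), 1e-8)
weights = stick_breaking_weights(alpha, max_nodes=50, rng=rng)
print("node weights (first 5):", np.round(weights[:5], 4))
print("effective nodes (weight > 1e-3):", int((weights > 1e-3).sum()))
```

Under these assumptions the printed count of effective nodes is typically 1–2, which matches the stated motivation of promoting sparse usage of FSC nodes.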