reproducibilityindex.ai

Autonomous Capability Assessment of Sequential Decision-Making Systems in Stochastic Settings

Authors: Pulkit Verma, Rushang Karia, Siddharth Srivastava

NeurIPS 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We implemented Alg. 1 in Python to evaluate our approach empirically. We found that our query synthesis and interactive learning process leads to (i) few-shot generalization; (ii) convergence to a sound and complete model; and (iii) much greater sample efficiency and accuracy for learning lifted SDM models with complex capabilities as compared to the baseline.
Researcher Affiliation	Academia	Pulkit Verma, Rushang Karia, and Siddharth Srivastava Autonomous Agents and Intelligent Robots Lab, School of Computing and Augmented Intelligence, Arizona State University, AZ, USA {verma.pulkit, rushang.karia, siddharths}@asu.edu
Pseudocode	Yes	Algorithm 1: QACE Algorithm
Open Source Code	Yes	Source code available at https://github.com/AAIR-lab/QACE
Open Datasets	No	The paper describes creating SDMAs and uses terms like "single training problem" and "test set" composed of problems with varying object counts. It refers to simulators and other research systems used, but it does not provide concrete access information (link, DOI, specific citation with authors/year, or mention of a standard public dataset name with proper attribution) for any publicly available or open datasets used for training.
Dataset Splits	No	The paper mentions using a "single training problem" and a "test set". While it discusses the generation of "test samples" for evaluating variational distance, it does not specify any training/validation/test dataset splits (e.g., percentages, sample counts, or predefined splits) or cross-validation setup.
Hardware Specification	Yes	We ran the experiments on a cluster of Intel Xeon E5-2680 v4 CPUs running at 2.4 GHz under CentOS 7.9, with a memory limit of 8 GB and a time limit of 4 hours.
Software Dependencies	No	The paper mentions implementation in "Python" and use of "CentOS 7.9" and "PRP [Muise et al., 2012] as the FOND planner". However, it does not provide specific version numbers for Python or PRP, nor does it list other software dependencies with their versions.
Experiment Setup	Yes	For QACE, we used α = 2d, where d is the maximum depth of policies used in queries generated by QACE, and η = 5.
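
The reported settings (α = 2d, η = 5, an 8 GB memory limit, and a 4 hour time limit) could be captured for a local rerun roughly as in the sketch below. This is only an illustration under stated assumptions, not the authors' setup: the names `alpha_for_depth` and `run_limited` are hypothetical, "8 GB" is interpreted as 8 GiB of address space, the time limit is treated as wall-clock time, and the limits are applied with Python's standard `resource` and `subprocess` modules on a POSIX system rather than through a cluster scheduler.

```python
import resource
import subprocess
import sys

# Values reported in the table above; the units are assumptions.
MEMORY_LIMIT_BYTES = 8 * 1024**3   # "8 GB", interpreted here as 8 GiB
TIME_LIMIT_SECONDS = 4 * 60 * 60   # "4 hours", interpreted as wall-clock time
ETA = 5                            # eta = 5


def alpha_for_depth(max_policy_depth: int) -> int:
    """Return alpha = 2d, where d is the maximum policy depth used in queries."""
    return 2 * max_policy_depth


def _apply_memory_limit() -> None:
    """Cap the child's address space; runs in the child just before exec (POSIX only)."""
    _, hard = resource.getrlimit(resource.RLIMIT_AS)
    soft = MEMORY_LIMIT_BYTES
    if hard != resource.RLIM_INFINITY:
        # Never request a soft limit above the existing hard limit.
        soft = min(soft, hard)
    resource.setrlimit(resource.RLIMIT_AS, (soft, hard))


def run_limited(cmd: list) -> subprocess.CompletedProcess:
    """Run cmd under the memory cap; raises subprocess.TimeoutExpired after 4 hours."""
    return subprocess.run(
        cmd,
        preexec_fn=_apply_memory_limit,
        timeout=TIME_LIMIT_SECONDS,
        check=False,
    )


if __name__ == "__main__":
    # Placeholder command; substitute the actual experiment entry point.
    result = run_limited([sys.executable, "-c", "print('ok')"])
    print(alpha_for_depth(3), ETA, result.returncode)
```

The command passed to `run_limited` here is a placeholder; the actual entry point should be taken from the linked source repository.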