reproducibilityindex.ai

Periodic agent-state based Q-learning for POMDPs

Authors: Amit Sinha, Matthieu Geist, Aditya Mahajan

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Finally, we present a numerical experiment to highlight the salient features of PASQL and demonstrate the benefit of learning periodic policies over stationary policies.
Researcher Affiliation	Collaboration	Amit Sinha¹, Matthieu Geist², and Aditya Mahajan¹; ¹McGill University, Mila; ²Cohere
Pseudocode	No	The paper describes algorithms (PASQL) and their update rules using mathematical notation, but it does not include any clearly labeled "Pseudocode" or "Algorithm" blocks with structured, step-by-step procedures.
Open Source Code	No	We intend to make the code open access after the review process is complete.
Open Datasets	No	The paper defines custom POMDP models (Example 1 and Example 2) within the text for its numerical experiments, rather than using or providing access to external, publicly available datasets. For instance, "Example 1: Consider a POMDP with S = {0, 1, ..., 5}, A = {0, 1}, Y = {0, 1} and γ = 0.9. The dynamics are as shown in Fig. 2."
Dataset Splits	No	The paper mentions running experiments for "25 random seeds" but does not specify explicit training, validation, and test dataset splits in terms of percentages or sample counts for any dataset.
Hardware Specification	No	The numerical experiments were enabled in part by support provided by Calcul Québec and Compute Canada.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x) that would allow for reproducible setup of the environment.
Experiment Setup	Yes	The hyperparameters for the numerical experiments presented in Sec. 3 are shown in App. H. Table 3: Hyperparameters used in Ex. 1. Training steps: 10^6; Start learn rate: 10^-3; End learn rate: 10^-5; Learn rate schedule: Exponential; Exponential decay rate: 1.0; Number of random seeds: 25.
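
Since the table notes that the paper has no pseudocode block and no released code, the following is a rough, unofficial sketch of what tabular Q-learning with a periodic policy might look like. It is not the authors' implementation. Assumptions made here and not taken from the paper: the agent state is the most recent observation, exploration is epsilon-greedy, the POMDP is randomly generated with the sizes of Example 1 (the dynamics of Fig. 2 are not reproduced on this page), and the learning rate is interpolated geometrically between the start and end values of Table 3, which ignores the "decay rate" entry. The function names `random_pomdp` and `periodic_q_learning` are hypothetical.

```python
import numpy as np


def random_pomdp(n_s=6, n_a=2, n_y=2, seed=0):
    """Hypothetical stand-in model with |S|=6, |A|=2, |Y|=2 (sizes from Example 1)."""
    rng = np.random.default_rng(seed)
    P = rng.dirichlet(np.ones(n_s), size=(n_s, n_a))  # P[s, a, s'], rows sum to 1
    O = rng.dirichlet(np.ones(n_y), size=n_s)         # O[s, y], rows sum to 1
    R = rng.uniform(-1.0, 1.0, size=(n_s, n_a))       # R[s, a] in [-1, 1]
    return P, O, R


def periodic_q_learning(P, O, R, gamma=0.9, period=2, steps=20_000,
                        lr_start=1e-3, lr_end=1e-5, eps=0.2, seed=0):
    """Keep one Q-table per phase; bootstrap from the table of the next phase."""
    rng = np.random.default_rng(seed)
    n_s, n_a, _ = P.shape
    n_y = O.shape[1]
    Q = np.zeros((period, n_y, n_a))

    s = rng.integers(n_s)
    z = rng.choice(n_y, p=O[s])  # assumption: agent state = latest observation
    for t in range(steps):
        phase, next_phase = t % period, (t + 1) % period
        # Assumption: geometric interpolation from lr_start to lr_end.
        lr = lr_start * (lr_end / lr_start) ** (t / max(steps - 1, 1))

        # Assumption: epsilon-greedy exploration on the current phase's table.
        if rng.random() < eps:
            a = int(rng.integers(n_a))
        else:
            a = int(np.argmax(Q[phase, z]))

        r = R[s, a]
        s_next = rng.choice(n_s, p=P[s, a])
        z_next = rng.choice(n_y, p=O[s_next])

        # Temporal-difference update toward the next phase's greedy value.
        target = r + gamma * Q[next_phase, z_next].max()
        Q[phase, z, a] += lr * (target - Q[phase, z, a])
        s, z = s_next, z_next
    return Q
```

Setting `period=1` reduces the sketch to ordinary stationary agent-state Q-learning, which gives a baseline for comparison. Calling `periodic_q_learning(*random_pomdp())` returns an array of shape `(period, n_y, n_a)`.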