reproducibilityindex.ai

The Update-Equivalence Framework for Decision-Time Planning

Authors: Samuel Sokota, Gabriele Farina, David J Wu, Hengyuan Hu, Kevin A. Wang, J Zico Kolter, Noam Brown

ICLR 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we add further evidence for the update-equivalence framework's utility by showing that the novel DTP algorithms derived from it also perform well in practice. We focus on two settings with imperfect information: i) two variants of Hanabi (Bard et al., 2020), a fully cooperative card game in which PBS-based DTP approaches are considered state-of-the-art; and ii) 3x3 Abrupt Dark Hex and Phantom Tic-Tac-Toe, 2p0s games with virtually no public information.
Researcher Affiliation	Collaboration	Samuel Sokota¹, Gabriele Farina², David J. Wu, Hengyuan Hu³, Kevin A. Wang⁴, J. Zico Kolter¹˒⁵, Noam Brown⁶. Work done at Meta AI. ¹Carnegie Mellon University, ²Massachusetts Institute of Technology, ³Stanford University, ⁴Brown University, ⁵Bosch AI, ⁶OpenAI
Pseudocode	Yes	Algorithm 1 Update Equivalent Search for Last-Iterate Algorithm with Action-Value Feedback and Update U
Open Source Code	No	The paper does not provide an explicit statement about releasing its source code or a link to a code repository for the methodology described.
Open Datasets	Yes	We test MDS in Hanabi (Bard et al., 2020), the standard benchmark for search in fully cooperative imperfect-information games.
Dataset Splits	No	The paper describes training models and using standard benchmarks, but it does not specify explicit training, validation, or test dataset splits (e.g., percentages or sample counts) for reproducibility.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies	No	The paper mentions various software components and frameworks like PPO, DQN, NFSP, and OpenSpiel, but it does not provide specific version numbers for these software dependencies (e.g., 'OpenSpiel vX.Y.Z').
Experiment Setup	Yes	For our Hanabi experiments, we used η = 20 for the MDS results in Tables 1 and 2. We performed search with 10,000 samples.