Composable Planning with Attributes

Authors: Amy Zhang, Sainbayar Sukhbaatar, Adam Lerer, Arthur Szlam, Rob Fergus

ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We evaluate compositional planning in several environments. We first consider 3D block stacking, and show that we can compose single-action tasks seen during training to perform multi-step tasks. Second, we plan over multi-step policies in 2D grid-world tasks. Finally, we see how our approach scales to a unit-building task in StarCraft. |
| Researcher Affiliation | Collaboration | Facebook AI Research, New York, NY, USA; New York University, New York, NY, USA. |
| Pseudocode | Yes | Algorithm 1: Attribute Planner Training |
| Open Source Code | No | The paper does not contain an explicit statement about the release of its source code or a link to a code repository for the described methodology. |
| Open Datasets | No | The paper uses environments such as MuJoCo, MazeBase, and StarCraft for its experiments, generating data within them ("Each training episode is initiated from a random initial state and lasts only one step"), but does not provide concrete access information (link, DOI, or a specific citation with authors and year for a public dataset) for the data used. |
| Dataset Splits | No | The paper mentions training on "1 million examples" or "10,000 examples" for the attribute detector, but it does not specify explicit training/validation/test dataset splits (e.g., exact percentages or sample counts for each split). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed machine specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions several environments and platforms such as MuJoCo, MazeBase, and StarCraft with citations, but it does not specify software dependencies (e.g., programming language, libraries, or frameworks) with version numbers. |
| Experiment Setup | Yes | Models are trained for a total of 30 million steps. AP uses 16 million steps for exploration and 14 million steps for training. ... During the final phase of training we simultaneously compute π and c_π, so we use an exponentially decaying average of the success rate of π to deal with its nonstationarity: c_π(ρ_i, ρ_j) = Σ_{t=1}^{T} γ^{T−t} S_t(ρ_i, ρ_j) / Σ_{t=1}^{T} γ^{T−t} A_t(ρ_i, ρ_j), where T is the number of training epochs, A_t is the number of attempted transitions (ρ_i, ρ_j) during epoch t, and S_t is the number of successful transitions. A decay rate of γ = 0.9 is used. |
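
To make the decayed success-rate estimate in the Experiment Setup quote concrete, here is a minimal Python sketch, assuming per-epoch attempt and success counts are available as arrays. The function name `decayed_success_rate` and all variable names are ours for illustration, not from the paper.

```python
import numpy as np

def decayed_success_rate(attempts, successes, gamma=0.9):
    """Exponentially decaying average of a policy's success rate.

    A sketch of c_pi(rho_i, rho_j) as quoted above (names are ours):
      attempts:  A_t(rho_i, rho_j), attempted transitions in epoch t = 1..T
      successes: S_t(rho_i, rho_j), successful transitions in epoch t
      gamma:     decay rate (the paper reports gamma = 0.9)
    """
    attempts = np.asarray(attempts, dtype=float)
    successes = np.asarray(successes, dtype=float)
    T = len(attempts)
    # Epoch t gets weight gamma^(T - t), so recent epochs dominate the
    # average; this is what handles the nonstationarity of pi as it trains.
    weights = gamma ** (T - np.arange(1, T + 1))
    denom = np.sum(weights * attempts)
    return np.sum(weights * successes) / denom if denom > 0 else 0.0

# Example: a policy that improves over epochs. The decayed estimate
# (~0.53) sits above the plain unweighted rate (0.50) because later,
# more successful epochs carry more weight.
print(decayed_success_rate(attempts=[10, 10, 10, 10],
                           successes=[2, 4, 6, 8]))
```

In the planner this quantity would presumably serve as the edge cost between attribute sets ρ_i and ρ_j; the sketch only illustrates the weighting scheme.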