reproducibilityindex.ai

Reclaiming the Source of Programmatic Policies: Programmatic versus Latent Spaces

Authors: Tales Henrique Carvalho, Kenneth Tjhia, Levi Lelis

ICLR 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this paper, we show that the programmatic space, induced by the domain-specific language and requiring no training, presents values for the behavior loss similar to those observed in latent spaces presented in previous work. Moreover, algorithms searching in the programmatic space significantly outperform those in LEAPS and HPRL. To explain our results, we measured the friendliness of the two spaces to local search algorithms.
Researcher Affiliation	Academia	Tales H. Carvalho, Kenneth Tjhia, Levi H. S. Lelis. Amii, Department of Computing Science, University of Alberta. {taleshen,tjhia,levi.lelis}@ualberta.ca
Pseudocode	Yes	Algorithm 1 Hill Climbing for Programmatic Policies
Open Source Code	Yes	The codebase used in this work is available online (footnote 1 in the paper).
Open Datasets	Yes	We consider the KAREL and KAREL-HARD problem sets to define tasks. The KAREL set contains the tasks STAIRCLIMBER, MAZE, FOURCORNERS, TOPOFF, HARVESTER and CLEANHOUSE, all introduced by Trivedi et al. (2021).
Dataset Splits	No	The paper evaluates policies based on expected return over a set of initial states and describes various search algorithms, but it does not specify a separate validation dataset split.
Hardware Specification	No	The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for conducting the experiments.
Software Dependencies	No	The paper describes the problem domain and the algorithms used, but it does not list specific software dependencies with their version numbers (e.g., Python version, library versions like PyTorch or TensorFlow).
Experiment Setup	Yes	For CEBS, we set the dimension of the latent vector d = 256, the neighborhood size K = 64, the elite size E = 16, and the noise σ = 0.25. The hyperparameters for CEM and HPRL are exactly as described in their papers.
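
To make the Experiment Setup row concrete, the following is a minimal sketch of an elite-based local search in a latent space, using the quoted values d = 256, K = 64, E = 16 and σ = 0.25. It is not the paper's CEBS implementation. The function names `toy_return` and `beam_search`, the toy objective, the fixed iteration count, and the choice to carry old elites into the next round are all assumptions made for illustration. In the paper's setting, evaluation would mean decoding the latent vector into a program and measuring its return in a KAREL task.

```python
import random

# Hyperparameters quoted in the Experiment Setup row.
D = 256       # dimension of the latent vector (d)
K = 64        # neighborhood size
E = 16        # elite size
SIGMA = 0.25  # standard deviation of the Gaussian noise


def toy_return(z):
    # Hypothetical stand-in for the return of a decoded program:
    # higher is better, with the maximum at z = (1, 1, ..., 1).
    return -sum((x - 1.0) ** 2 for x in z) / len(z)


def beam_search(evaluate, iterations=50, seed=0):
    rng = random.Random(seed)
    # Initial population: K latent vectors drawn from a standard normal.
    population = [[rng.gauss(0.0, 1.0) for _ in range(D)] for _ in range(K)]
    elites = sorted(population, key=evaluate, reverse=True)[:E]
    history = [evaluate(elites[0])]
    for _ in range(iterations):
        # Each candidate is a randomly chosen elite perturbed by Gaussian noise.
        candidates = [
            [x + rng.gauss(0.0, SIGMA) for x in rng.choice(elites)]
            for _ in range(K)
        ]
        # Assumption: old elites compete with the new candidates, so the
        # best score never decreases from one iteration to the next.
        elites = sorted(elites + candidates, key=evaluate, reverse=True)[:E]
        history.append(evaluate(elites[0]))
    return elites[0], history


if __name__ == "__main__":
    best, history = beam_search(toy_return)
    print(f"best return: first iteration {history[0]:.3f}, last {history[-1]:.3f}")
```

The fixed iteration budget is a simplification. A stopping rule based on whether the elites' returns still improve would be a natural alternative, but the table above does not specify one.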