Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Reclaiming the Source of Programmatic Policies: Programmatic versus Latent Spaces
Authors: Tales Henrique Carvalho, Kenneth Tjhia, Levi Lelis
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we show that the programmatic space, induced by the domain-specific language and requiring no training, presents values for the behavior loss similar to those observed in latent spaces presented in previous work. Moreover, algorithms searching in the programmatic space significantly outperform those in LEAPS and HPRL. To explain our results, we measured the friendliness of the two spaces to local search algorithms. |
| Researcher Affiliation | Academia | Tales H. Carvalho, Kenneth Tjhia, Levi H. S. Lelis; Amii, Department of Computing Science, University of Alberta |
| Pseudocode | Yes | Algorithm 1 Hill Climbing for Programmatic Policies |
| Open Source Code | Yes | The codebase used in this work is available online (footnote 1 of the paper). |
| Open Datasets | Yes | We consider the KAREL and KAREL-HARD problem sets to define tasks. The KAREL set contains the tasks STAIRCLIMBER, MAZE, FOURCORNERS, TOPOFF, HARVESTER and CLEANHOUSE, all introduced by Trivedi et al. (2021). |
| Dataset Splits | No | The paper evaluates policies based on expected return over a set of initial states and describes various search algorithms, but it does not specify a separate validation dataset split. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for conducting the experiments. |
| Software Dependencies | No | The paper describes the problem domain and the algorithms used, but it does not list specific software dependencies with their version numbers (e.g., Python version, library versions like PyTorch or TensorFlow). |
| Experiment Setup | Yes | For CEBS, we set the dimension of the latent vector d = 256, the neighborhood size K = 64, the elite size E = 16, and the noise σ = 0.25. The hyperparameters for CEM and HPRL are exactly as described in their papers. |
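
The CEBS hyperparameters quoted in the Experiment Setup row can be recorded in a small configuration object, which makes a rerun easier to audit. The sketch below is illustrative only: `CEBSConfig` and `elite_noise_search` are hypothetical names that do not come from the paper's codebase, and the search loop is a generic "keep the best E candidates, perturb them with Gaussian noise" procedure over latent vectors, assumed here for illustration. It is not the paper's CEBS algorithm, and the toy scoring function stands in for the expected return of a decoded program.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class CEBSConfig:
    """Hyperparameter values quoted in the table above (field names are assumptions)."""
    latent_dim: int = 256         # d: dimension of the latent vector
    neighborhood_size: int = 64   # K: candidates evaluated per iteration
    elite_size: int = 16          # E: candidates kept per iteration
    noise_std: float = 0.25       # sigma: standard deviation of the added noise


def elite_noise_search(score, config, iterations=10, seed=0):
    """Generic elite-based search with Gaussian perturbation.

    An assumed stand-in for illustration, NOT the paper's CEBS procedure.
    `score` maps a latent vector (list of floats) to a number to maximize.
    """
    rng = random.Random(seed)
    # Start from K latent vectors drawn from a standard normal distribution.
    population = [
        [rng.gauss(0.0, 1.0) for _ in range(config.latent_dim)]
        for _ in range(config.neighborhood_size)
    ]
    best = max(population, key=score)
    for _ in range(iterations):
        # Keep the E highest-scoring candidates.
        elites = sorted(population, key=score, reverse=True)[: config.elite_size]
        if score(elites[0]) > score(best):
            best = elites[0]
        # Refill the population to K candidates by perturbing the elites in turn.
        population = [
            [x + rng.gauss(0.0, config.noise_std)
             for x in elites[i % config.elite_size]]
            for i in range(config.neighborhood_size)
        ]
    # The last generation has not been compared against the incumbent yet.
    candidate = max(population, key=score)
    if score(candidate) > score(best):
        best = candidate
    return best


if __name__ == "__main__":
    config = CEBSConfig()

    # Toy objective: prefer latent vectors close to the origin.
    def toy_score(z):
        return -sum(x * x for x in z)

    result = elite_noise_search(toy_score, config, iterations=5, seed=0)
    print(len(result), toy_score(result))
```

Freezing the dataclass keeps the recorded values from being changed mid-run. The CEM and HPRL settings are not listed on this page, so they are left out rather than guessed.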