reproducibilityindex.ai

Compositional Automata Embeddings for Goal-Conditioned Reinforcement Learning

Authors: Beyazit Yalcinkaya, Niklas Lauffer, Marcell Vazquez-Chanlatte, Sanjit Seshia

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Through empirical evaluation, we demonstrate that the proposed pre-training method enables zero-shot generalization to various cDFA task classes and accelerated policy specialization without the myopic suboptimality of hierarchical methods.
Researcher Affiliation	Collaboration	Beyazit Yalcinkaya University of California, Berkeley... Niklas Lauffer University of California, Berkeley... Marcell Vazquez-Chanlatte Nissan Advanced Technology Center... Sanjit A. Seshia University of California, Berkeley
Pseudocode	Yes	Algorithm 1: RAD cDFA Sampler
Open Source Code	Yes	For more information about the project, visit: https://rad-embeddings.github.io/.
Open Datasets	Yes	Letterworld Environment. Introduced in [4, 38], Letterworld is a 7x7 grid where the agent occupies one square at a time.
Dataset Splits	No	The paper describes the training and evaluation procedures in the context of reinforcement learning, which involves training an agent through interaction with an environment and evaluating performance on sampled tasks/episodes. It does not explicitly define traditional training, validation, and test splits for a static dataset.
Hardware Specification	Yes	Each seed in the experiments section was run as an individual Slurm job with access to 4 cores of an AMD EPYC 7763 running at 2.45 GHz and access to at most 20 GB of memory.
Software Dependencies	No	The paper mentions specific algorithms and network architectures like GATv2 [9], RGCN [34], and PPO [35], but does not provide version numbers for any specific software libraries or dependencies (e.g., PyTorch, TensorFlow versions).
Experiment Setup	Yes	Table 1 shows the hyperparameters used for every training run in the experiments section.
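
The table refers to "cDFA task classes" without saying what such a task looks like. As a rough illustration only, the sketch below treats a cDFA as a conjunction of DFAs that must all accept the same sequence of symbols. The `DFA` class, the `cdfa_accepts` helper, the self-loop on undefined transitions, and the letter-based example tasks are all assumptions made for this example; none of them is taken from the paper or its released code.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Tuple


@dataclass
class DFA:
    """A hypothetical minimal DFA over single-character symbols."""
    start: int
    accepting: FrozenSet[int]
    transitions: Dict[Tuple[int, str], int]

    def step(self, state: int, symbol: str) -> int:
        # Assumption: a missing transition leaves the state unchanged.
        return self.transitions.get((state, symbol), state)

    def accepts(self, word: Iterable[str]) -> bool:
        state = self.start
        for symbol in word:
            state = self.step(state, symbol)
        return state in self.accepting


def cdfa_accepts(dfas: Iterable[DFA], word: str) -> bool:
    """Assumed conjunctive semantics: every component DFA must accept."""
    return all(dfa.accepts(word) for dfa in dfas)


# Illustrative tasks: "eventually see a" and "see b, then later c".
reach_a = DFA(start=0, accepting=frozenset({1}), transitions={(0, "a"): 1})
b_then_c = DFA(
    start=0,
    accepting=frozenset({2}),
    transitions={(0, "b"): 1, (1, "c"): 2},
)

satisfied = cdfa_accepts([reach_a, b_then_c], "bac")
violated = cdfa_accepts([reach_a, b_then_c], "cab")
```

Here `satisfied` is true because both component tasks are completed by the sequence `bac`, while `violated` is false because `cab` never reaches `c` after `b`.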