Leveraging Low-Rank and Sparse Recurrent Connectivity for Robust Closed-Loop Control
Authors: Neehal Tumma, Mathias Lechner, Noel Loo, Ramin Hasani, Daniela Rus
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We study the offline-online generalization gap of various recurrent architectures parameterized by low-rank, sparse connectivity under an imitation learning framework. In particular, we measure the performance of our models in the Arcade Learning Environment (Bellemare et al., 2012) and MuJoCo (Todorov et al., 2012). For ALEs, we run experiments in the Seaquest and Alien environments and for MuJoCo we explore the HalfCheetah environment. |
| Researcher Affiliation | Academia | Neehal Tumma (Harvard University); Mathias Lechner, Noel Loo, Ramin Hasani, Daniela Rus (MIT CSAIL) |
| Pseudocode | No | The paper provides functional forms and mathematical derivations but no explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The reproducibility statement describes how to reproduce the initialization scheme but does not explicitly state that the full methodology's source code is open-source or provide a link. |
| Open Datasets | Yes | Within the set of ALEs, we considered the Seaquest and Alien environments. [...] Within the set of MuJoCo environments, we considered the HalfCheetah environment. |
| Dataset Splits | Yes | Note that only offline performance on the validation/test set was used to determine the best performing model in the grid search. [...] Trained models were evaluated offline on a validation/test set with respect to their cross-entropy loss in the case of ALE networks and mean-squared error in the case of MuJoCo networks. |
| Hardware Specification | No | The paper describes the software models and frameworks used (Ape-X DQN, RLlib, PPO) but does not provide specific details about the hardware (e.g., GPU/CPU models) used to run the experiments. |
| Software Dependencies | No | The paper mentions 'TensorFlow (Abadi et al., 2016)' and 'RLlib (Liang et al., 2017)' but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | Below is a table with the model hyperparameters that were used in both the ALE networks and MuJoCo networks. Parameters in square brackets represent a grid search over which the best performing model was chosen. optimizer: Adam (β1 = 0.9, β2 = 0.999); hidden size: 64; learning rate: [5×10⁻⁵, 1×10⁻⁴, 5×10⁻⁴]; epochs: 150; rank: {1, 5, 16, 27, full}; sparsity: {0, 0.2, 0.5, 0.8} |
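
The Research Type and Experiment Setup rows describe recurrent connectivity constrained by rank and sparsity, swept over a small hyperparameter grid. Below is a minimal NumPy sketch of one way such a constraint could be parameterized; the low-rank factorization, random sparsity mask, initialization scale, and function names are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

def make_low_rank_sparse_recurrent(hidden_size=64, rank=5, sparsity=0.5, seed=0):
    """Hypothetical construction of a recurrent weight matrix whose rank is
    bounded by `rank` (via a U @ V^T factorization) and whose connections are
    pruned by a random mask with a `sparsity` fraction of zeros."""
    rng = np.random.default_rng(seed)
    # Low-rank factors: W_rec = U @ V^T has rank at most `rank`.
    U = rng.standard_normal((hidden_size, rank)) / np.sqrt(hidden_size)
    V = rng.standard_normal((hidden_size, rank)) / np.sqrt(hidden_size)
    W_rec = U @ V.T
    # Sparsity mask: zero out roughly a `sparsity` fraction of connections.
    mask = rng.random((hidden_size, hidden_size)) >= sparsity
    return W_rec * mask

def rnn_step(h, x, W_rec, W_in, b):
    """One step of a vanilla RNN using the constrained recurrent matrix."""
    return np.tanh(h @ W_rec.T + x @ W_in.T + b)

# Grid from the Experiment Setup row ("full" rank taken as the hidden size 64;
# the learning-rate sweep only affects training, which is not sketched here).
for rank in [1, 5, 16, 27, 64]:
    for sparsity in [0.0, 0.2, 0.5, 0.8]:
        W_rec = make_low_rank_sparse_recurrent(rank=rank, sparsity=sparsity)
```

In this sketch the rank constraint is structural (the product of two thin factors can never exceed rank `rank`), while sparsity is imposed by masking; how the paper combines, initializes, and trains these constraints is described in its reproducibility statement rather than in released code.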