Model-Based Episodic Memory Induces Dynamic Hybrid Controls

Authors: Hung Le, Thommen Karimpanal George, Majid Abdolshah, Truyen Tran, Svetha Venkatesh

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments demonstrate that our model allows significantly faster and better learning than other strong reinforcement learning agents across a variety of environments including stochastic and non-Markovian settings.
Researcher Affiliation | Academia | Hung Le, Thommen Karimpanal George, Majid Abdolshah, Truyen Tran, Svetha Venkatesh; Applied AI Institute, Deakin University, Geelong, Australia; thai.le@deakin.edu.au
Pseudocode | Yes | Algorithm 1 MBEC++: Complementary reinforcement learning with MBEC and DQN.
Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing the source code for the methodology described in this paper, nor does it provide a direct link to such a repository.
Open Datasets | Yes | We consider 3 classical problems: Cart Pole, Mountain Car and Lunar Lander.
Dataset Splits | No | The paper mentions training models and using replay buffers, but it does not specify explicit training/validation/test dataset splits (e.g., percentages or counts) for the environments used in the experiments.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | No | The paper mentions software components like DQN and LSTM implementations, but does not provide specific version numbers for any key software dependencies or libraries.
Experiment Setup | Yes | Details of the baseline configurations and hyper-parameter tuning for each tasks can be found in Appendix B.
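
The Pseudocode row names Algorithm 1 (MBEC++), described as complementary reinforcement learning with MBEC and DQN. The actual update rules are defined only in the paper; the fragment below is a minimal, hypothetical sketch of the general "hybrid control" idea the title suggests, blending a parametric DQN value with an episodic-memory value through a mixing weight. All names and the blending form here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def hybrid_action_values(q_dqn, v_episodic, beta):
    """Illustrative blend of DQN action-values with episodic-memory values.

    q_dqn      : (num_actions,) parametric Q-values from the DQN head
    v_episodic : (num_actions,) value estimates retrieved from episodic memory
    beta       : scalar in [0, 1], a (possibly state-dependent) mixing weight

    NOTE: this weighted sum is a hypothetical sketch only; the paper's
    Algorithm 1 specifies the real MBEC++ combination and updates.
    """
    return (1.0 - beta) * q_dqn + beta * v_episodic

# Toy usage: choose the greedy action under the blended estimates.
q_dqn = np.array([0.2, 0.5, 0.1])
v_mem = np.array([0.4, 0.3, 0.9])
action = int(np.argmax(hybrid_action_values(q_dqn, v_mem, beta=0.5)))
```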
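
The Open Datasets row cites three classic control tasks (Cart Pole, Mountain Car, Lunar Lander). These are standard OpenAI Gym environments rather than downloadable datasets; the sketch below shows one way to instantiate them, assuming the pre-0.26 Gym API that was current around 2021 (LunarLander-v2 additionally needs the box2d extra). It runs a random policy purely for illustration and is not part of the paper's experimental setup.

```python
import gym

# Standard Gym IDs for the three tasks named in the paper.
ENV_IDS = ["CartPole-v1", "MountainCar-v0", "LunarLander-v2"]

for env_id in ENV_IDS:
    env = gym.make(env_id)
    obs = env.reset()
    done, episode_return = False, 0.0
    while not done:
        action = env.action_space.sample()      # random policy, illustrative only
        obs, reward, done, info = env.step(action)
        episode_return += reward
    print(f"{env_id}: random-policy return = {episode_return:.1f}")
    env.close()
```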