Flow-based Intrinsic Curiosity Module
Authors: Hsuan-Kung Yang, Po-Han Chiang, Min-Fong Hong, Chun-Yi Lee
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method and compare it with a number of existing methods on multiple benchmark environments, including Atari games, Super Mario Bros., and ViZDoom. We demonstrate that FICM is favorable to tasks or environments featuring moving objects, which allow FICM to utilize the motion features between consecutive observations. We further ablatively analyze the encoding efficiency of FICM, and discuss its applicable domains comprehensively. |
| Researcher Affiliation | Academia | Hsuan-Kung Yang, Po-Han Chiang, Min-Fong Hong and Chun-Yi Lee Elsa Lab, Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan {hellochick, ymmoy999, romulus, cylee}@gapp.nthu.edu.tw |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | See here for our codes and demo videos. |
| Open Datasets | Yes | We validate the above properties of FICM in a variety of benchmark environments, including Atari 2600 [Bellemare et al., 2013], Super Mario Bros., and ViZDoom [Wydmuch et al., 2018]. |
| Dataset Splits | No | The paper uses benchmark environments and plots evaluation curves, but does not explicitly state specific train/validation/test dataset splits (e.g., percentages or sample counts) for its experiments. |
| Hardware Specification | No | The paper mentions 'the donation of the GPUs from NVIDIA Corporation and NVIDIA AI Technology Center' but does not specify the exact models or specifications of the GPUs or other hardware used for the experiments. |
| Software Dependencies | No | The paper states that 'all of the experiments are conducted with proximal policy optimization (PPO) [Schulman et al., 2017]' and that DRL agents are 'based on Asynchronous Advantage Actor-Critic (A3C) [Mnih et al., 2016]', but it does not provide specific version numbers for these or any other software libraries or dependencies. |
| Experiment Setup | No | The paper mentions using PPO and A3C, and discusses input types (RGB vs. gray-scale, stacked vs. non-stacked frames), but it does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs, optimizer settings) or detailed system-level training configurations used in its experiments. |