VECA: A New Benchmark and Toolkit for General Cognitive Development

Authors: Kwanyoung Park, Hyunseok Oh, Youngki Lee

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We assess several representative RL algorithms with our VECA benchmark, including policy gradient methods (Espeholt et al. 2018; Schulman et al. 2017; Haarnoja et al. 2018) and curiosity-driven learning (Burda et al. 2019), and find that there is still a long way to go to reach human-level cognitive capabilities. Experimental results show that goal-driven learning (IMPALA, the policy gradient method) initially outperforms unsupervised exploration without explicit reward (curiosity-driven learning), but it prematurely converges and improves only marginally over a random policy. (A hedged curiosity sketch follows this table.)
Researcher Affiliation | Academia | Department of Computer Science and Engineering, Seoul National University, South Korea. william202@snu.ac.kr, ohsai@snu.ac.kr, youngkilee@snu.ac.kr
Pseudocode | No | The paper describes its methodology and components but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our VECA environment and benchmark are available to the public for research purposes at https://github.com/snuhcs/veca/.
Open Datasets | Yes | Our VECA environment and benchmark are available to the public for research purposes at https://github.com/snuhcs/veca/.
Dataset Splits | No | The paper does not specify fixed train/validation/test dataset splits. It states that agents are trained on 'the entire VECA tasks, which are uniformly sampled at random per episode', implying data is generated dynamically within the environment rather than drawn from pre-defined splits. (A minimal sampling sketch follows this table.)
Hardware Specification | Yes | To run the VECA Unity3D application, we use an Intel(R) Core(TM) i7-6700K with 32 GB RAM on Windows 10. We use a Xeon Gold 5218 CPU with 256 GB RAM and four NVIDIA TITAN Xp 12 GB GPUs on Ubuntu 16.04 to train the agent algorithm.
Software Dependencies | No | The paper mentions using the 'Unity3D game engine' and 'Ubuntu 16.04' as the operating system, and RL algorithms such as IMPALA, PPO, SAC, and CUR, but does not provide specific version numbers for software dependencies such as Unity3D, Python, or the deep learning framework used (e.g., PyTorch or TensorFlow).
Experiment Setup | Yes | We sample binocular RGB vision data at a resolution of 84x84. No blur or grayscaling is applied to the agent's vision. We sample audio data at a rate of 22050 Hz and convert it to the frequency domain by FFT with a window size of 1024. Minimum audible distance d_TH = 20, threshold of tactile sensory value δ = 0.05, decay rate of tactile sensory value λ = 0.0. Tactile input has a dimension of 3296. Table 3 lists the hyperparameter setup for each RL algorithm. (A hedged preprocessing sketch follows this table.)
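The "Research Type" row contrasts goal-driven learning (IMPALA) with curiosity-driven learning (Burda et al. 2019), which explores without an explicit reward. For reference, below is a minimal PyTorch sketch of one common form of that curiosity signal, random network distillation (RND): the intrinsic reward is the error of a trained predictor against a fixed, randomly initialized target network. The flat observation vector, network sizes, and learning rate are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_encoder(obs_dim: int, feat_dim: int) -> nn.Module:
    # Small MLP on a flat observation vector (an assumption; VECA's
    # agents actually consume multimodal vision/audio/tactile input).
    return nn.Sequential(
        nn.Linear(obs_dim, 256), nn.ReLU(),
        nn.Linear(256, feat_dim),
    )

class RNDCuriosity:
    """RND-style curiosity: reward = predictor's error against a
    fixed, randomly initialized target network."""

    def __init__(self, obs_dim: int, feat_dim: int = 64, lr: float = 1e-4):
        self.target = make_encoder(obs_dim, feat_dim)
        for p in self.target.parameters():
            p.requires_grad_(False)  # target stays frozen
        self.predictor = make_encoder(obs_dim, feat_dim)
        self.opt = torch.optim.Adam(self.predictor.parameters(), lr=lr)

    def intrinsic_reward(self, obs: torch.Tensor) -> torch.Tensor:
        # Novel observations are poorly predicted -> high reward.
        with torch.no_grad():
            tgt = self.target(obs)
        err = (self.predictor(obs) - tgt).pow(2).mean(dim=-1)
        return err.detach()

    def update(self, obs: torch.Tensor) -> float:
        # Train the predictor toward the frozen target's features.
        loss = F.mse_loss(self.predictor(obs), self.target(obs).detach())
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()
        return loss.item()

# Example: a batch of 32 random observations of dimension 128.
cur = RNDCuriosity(obs_dim=128)
obs = torch.randn(32, 128)
reward = cur.intrinsic_reward(obs)  # shape (32,)
cur.update(obs)
```

Under pure curiosity-driven learning this prediction error is the only training signal, which matches the reported behavior: broad early exploration, but no pull toward task-specific goals.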
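The "Dataset Splits" row reports that tasks are uniformly sampled at random per episode rather than partitioned into fixed train/validation/test sets. A minimal sketch of that sampling protocol is below; the task names are hypothetical placeholders and the environment rollout is omitted, since the excerpt describes only the sampling rule.

```python
import random

# Hypothetical task identifiers standing in for the VECA task set.
TASKS = ["grasp_object", "navigate_to_sound", "avoid_obstacle"]

def train(num_episodes: int, seed: int = 0) -> None:
    rng = random.Random(seed)
    for episode in range(num_episodes):
        # A fresh task is drawn uniformly at random every episode,
        # so data is generated on the fly instead of pre-split.
        task = rng.choice(TASKS)
        # run_episode(env, policy, task) would roll out the agent
        # on the sampled task; omitted here.
        print(f"episode {episode}: task={task}")

if __name__ == "__main__":
    train(num_episodes=5)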
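The "Experiment Setup" row gives the audio pipeline: sampling at 22050 Hz and conversion to the frequency domain by FFT with a window size of 1024. A NumPy sketch of that step follows; non-overlapping windows and magnitude (rather than power) spectra are assumptions, since the excerpt states only the rate and window size.

```python
import numpy as np

SAMPLE_RATE = 22050   # Hz, as reported in the paper
WINDOW_SIZE = 1024    # FFT window size, as reported in the paper

def audio_to_spectra(waveform: np.ndarray) -> np.ndarray:
    """Convert a mono waveform to per-window magnitude spectra.

    Non-overlapping windows and magnitude spectra are assumptions;
    the paper only specifies the sample rate and window size.
    """
    n_windows = len(waveform) // WINDOW_SIZE
    frames = waveform[: n_windows * WINDOW_SIZE].reshape(n_windows, WINDOW_SIZE)
    # rfft keeps the non-redundant half: WINDOW_SIZE // 2 + 1 = 513 bins.
    return np.abs(np.fft.rfft(frames, axis=-1))

# Example: one second of a 440 Hz tone -> (21, 513) spectrogram.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
wave = np.sin(2 * np.pi * 440 * t)
print(audio_to_spectra(wave).shape)
```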