VECA: A New Benchmark and Toolkit for General Cognitive Development

Authors: Kwanyoung Park, Hyunseok Oh, Youngki Lee

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We assess several representative RL algorithms with our VECA benchmark, including policy gradient methods (Espeholt et al. 2018; Schulman et al. 2017; Haarnoja et al. 2018) and curiosity-driven learning (Burda et al. 2019), and find that there is still a long way to go to reach human-level cognitive capabilities. Experimental results show that goal-driven learning (IMPALA, the policy gradient method) initially outperforms unsupervised exploration without explicit reward (curiosity-driven learning), but it prematurely converges and improves only marginally over a random policy. (A hedged curiosity sketch follows this table.)
Researcher Affiliation | Academia | Department of Computer Science and Engineering, Seoul National University, South Korea. william202@snu.ac.kr, ohsai@snu.ac.kr, youngkilee@snu.ac.kr
Pseudocode | No | The paper describes its methodology and components but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our VECA environment and benchmark are available to the public for research purposes at https://github.com/snuhcs/veca/.
Open Datasets | Yes | Our VECA environment and benchmark are available to the public for research purposes at https://github.com/snuhcs/veca/.
Dataset Splits | No | The paper does not specify fixed train/validation/test dataset splits. It states that agents are trained on 'the entire VECA tasks, which are uniformly sampled at random per episode', implying data is generated dynamically within the environment rather than drawn from pre-defined splits. (A minimal sampling sketch follows this table.)
Hardware Specification | Yes | To run the VECA Unity3D application, we use an Intel(R) Core(TM) i7-6700K with 32 GB RAM on Windows 10. We use a Xeon Gold 5218 CPU with 256 GB RAM and four NVIDIA TITAN Xp 12 GB GPUs on Ubuntu 16.04 to train the agent algorithm.
Software Dependencies | No | The paper mentions using the 'Unity3D game engine' and 'Ubuntu 16.04' as the operating system, and RL algorithms such as IMPALA, PPO, SAC, and CUR, but does not provide specific version numbers for software dependencies such as Unity3D, Python, or the deep learning framework used (e.g., PyTorch or TensorFlow).
Experiment Setup | Yes | We sample binocular RGB vision data at a resolution of 84x84. No blur or grayscaling is applied to the agent's vision. We sample audio data at a rate of 22050 Hz and convert it to the frequency domain by FFT with a window size of 1024. Minimum audible distance d_TH = 20, threshold of tactile sensory value δ = 0.05, decay rate of tactile sensory value λ = 0.0. Tactile input has a dimension of 3296. Table 3 lists the hyperparameter setup for each RL algorithm. (A hedged preprocessing sketch follows this table.)
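The "Research Type" row contrasts goal-driven learning (IMPALA) with curiosity-driven learning (Burda et al. 2019), which explores without an explicit reward. For reference, below is a minimal PyTorch sketch of one common form of that curiosity signal, random network distillation (RND): the intrinsic reward is the error of a trained predictor against a fixed, randomly initialized target network. The flat observation vector, network sizes, and learning rate are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_encoder(obs_dim: int, feat_dim: int) -> nn.Module:
    # Small MLP on a flat observation vector (an assumption; VECA's
    # agents actually consume multimodal vision/audio/tactile input).
    return nn.Sequential(
        nn.Linear(obs_dim, 256), nn.ReLU(),
        nn.Linear(256, feat_dim),
    )

class RNDCuriosity:
    """RND-style curiosity: reward = predictor's error against a
    fixed, randomly initialized target network."""

    def __init__(self, obs_dim: int, feat_dim: int = 64, lr: float = 1e-4):
        self.target = make_encoder(obs_dim, feat_dim)
        for p in self.target.parameters():
            p.requires_grad_(False)  # target stays frozen
        self.predictor = make_encoder(obs_dim, feat_dim)
        self.opt = torch.optim.Adam(self.predictor.parameters(), lr=lr)

    def intrinsic_reward(self, obs: torch.Tensor) -> torch.Tensor:
        # Novel observations are poorly predicted -> high reward.
        with torch.no_grad():
            tgt = self.target(obs)
        err = (self.predictor(obs) - tgt).pow(2).mean(dim=-1)
        return err.detach()

    def update(self, obs: torch.Tensor) -> float:
        # Train the predictor toward the frozen target's features.
        loss = F.mse_loss(self.predictor(obs), self.target(obs).detach())
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()
        return loss.item()

# Example: a batch of 32 random observations of dimension 128.
cur = RNDCuriosity(obs_dim=128)
obs = torch.randn(32, 128)
reward = cur.intrinsic_reward(obs)  # shape (32,)
cur.update(obs)
```

Under pure curiosity-driven learning this prediction error is the only training signal, which matches the reported behavior: broad early exploration, but no pull toward task-specific goals.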
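The "Dataset Splits" row reports that tasks are uniformly sampled at random per episode rather than partitioned into fixed train/validation/test sets. A minimal sketch of that sampling protocol is below; the task names are hypothetical placeholders and the environment rollout is omitted, since the excerpt describes only the sampling rule.

```python
import random

# Hypothetical task identifiers standing in for the VECA task set.
TASKS = ["grasp_object", "navigate_to_sound", "avoid_obstacle"]

def train(num_episodes: int, seed: int = 0) -> None:
    rng = random.Random(seed)
    for episode in range(num_episodes):
        # A fresh task is drawn uniformly at random every episode,
        # so data is generated on the fly instead of pre-split.
        task = rng.choice(TASKS)
        # run_episode(env, policy, task) would roll out the agent
        # on the sampled task; omitted here.
        print(f"episode {episode}: task={task}")

if __name__ == "__main__":
    train(num_episodes=5)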
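The "Experiment Setup" row gives the audio pipeline: sampling at 22050 Hz and conversion to the frequency domain by FFT with a window size of 1024. A NumPy sketch of that step follows; non-overlapping windows and magnitude (rather than power) spectra are assumptions, since the excerpt states only the rate and window size.

```python
import numpy as np

SAMPLE_RATE = 22050   # Hz, as reported in the paper
WINDOW_SIZE = 1024    # FFT window size, as reported in the paper

def audio_to_spectra(waveform: np.ndarray) -> np.ndarray:
    """Convert a mono waveform to per-window magnitude spectra.

    Non-overlapping windows and magnitude spectra are assumptions;
    the paper only specifies the sample rate and window size.
    """
    n_windows = len(waveform) // WINDOW_SIZE
    frames = waveform[: n_windows * WINDOW_SIZE].reshape(n_windows, WINDOW_SIZE)
    # rfft keeps the non-redundant half: WINDOW_SIZE // 2 + 1 = 513 bins.
    return np.abs(np.fft.rfft(frames, axis=-1))

# Example: one second of a 440 Hz tone -> (21, 513) spectrogram.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
wave = np.sin(2 * np.pi * 440 * t)
print(audio_to_spectra(wave).shape)
```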