Hierarchical ConViT with Attention-Based Relational Reasoner for Visual Analogical Reasoning

Authors: Wentao He, Jialu Zhang, Jianfeng Ren, Ruibin Bai, Xudong Jiang

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on three RPM datasets demonstrate that the proposed HCV-ARR achieves a significant performance gain compared with the state-of-the-art models.
Researcher Affiliation | Academia | (1) The Digital Port Technologies Lab, School of Computer Science, University of Nottingham Ningbo China; (2) Nottingham Ningbo China Beacons of Excellence Research and Innovation Institute, University of Nottingham Ningbo China; (3) School of Electrical & Electronic Engineering, Nanyang Technological University
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | The source code is available at: https://github.com/wentaoheunnc/HCV-ARR.
Open Datasets | Yes | The proposed model is evaluated on the RAVEN (Zhang et al. 2019a), I-RAVEN (Hu et al. 2021) and RAVEN-FAIR (Benny, Pekar, and Wolf 2021) datasets.
Dataset Splits | Yes | Each dataset is randomly split into 10 folds, with 6 folds for training, 2 folds for validation and 2 folds for testing (a split sketch follows the table).
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions using the Adam optimizer but does not specify any software names with version numbers required to replicate the experiment.
Experiment Setup | Yes | The number of ConViT blocks is set to N = 3. Following the settings in (Benny, Pekar, and Wolf 2021; Zhang et al. 2019a,b), the input images are resized to 80 × 80 pixels. The maximum number of epochs is 200, and training is stopped if there is no significant improvement on the validation set over 20 epochs. During training, the learning rate is set to 0.001, and the Adam optimizer is used with a weight decay of 1 × 10⁻⁵. The batch size is set to 32 (a training-loop sketch follows the table).
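
The 10-fold, 6/2/2 partition described in the Dataset Splits row is easy to mirror in code. Below is a minimal sketch assuming a NumPy permutation over sample indices; the paper does not state which RNG, seed, or fold-assignment order it uses, so those details are assumptions.

```python
import numpy as np

def ten_fold_split(num_samples: int, seed: int = 0):
    """Randomly partition sample indices into 10 folds, then assign
    6 folds to training, 2 to validation, and 2 to testing,
    matching the 6/2/2 split reported in the paper.

    The RNG, seed, and fold order are assumptions, not the authors' code.
    """
    rng = np.random.default_rng(seed)
    indices = rng.permutation(num_samples)
    folds = np.array_split(indices, 10)
    train = np.concatenate(folds[:6])
    val = np.concatenate(folds[6:8])
    test = np.concatenate(folds[8:])
    return train, val, test

# Example: split a hypothetical 70,000-sample RAVEN-style dataset.
train_idx, val_idx, test_idx = ten_fold_split(70000)
```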
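
The Experiment Setup row fixes enough hyperparameters (N = 3 ConViT blocks, 80 × 80 inputs, up to 200 epochs with 20-epoch early stopping, Adam at learning rate 0.001 with 1 × 10⁻⁵ weight decay, batch size 32) for a hedged PyTorch reconstruction. In the sketch below, the model is a deliberately trivial stand-in rather than the authors' HCV-ARR (the real architecture is in the repository linked above), the loaders hold random tensors instead of RAVEN puzzles, and "no significant improvement over 20 epochs" is interpreted as patience on validation accuracy; all three are assumptions.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Hyperparameters quoted in the Experiment Setup row.
MAX_EPOCHS = 200      # maximum number of epochs
PATIENCE = 20         # stop after 20 epochs without validation improvement
LR = 1e-3             # learning rate 0.001
WEIGHT_DECAY = 1e-5   # Adam weight decay of 1e-5
BATCH_SIZE = 32
IMG_SIZE = 80         # input images resized to 80 x 80 pixels

# Trivial stand-in for HCV-ARR (which uses N = 3 ConViT blocks); an RPM
# sample is treated as 16 panels (8 context + 8 candidates) stacked along
# the channel axis, with 8-way answer classification.
model = nn.Sequential(
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(32, 8),
)
optimizer = torch.optim.Adam(model.parameters(), lr=LR, weight_decay=WEIGHT_DECAY)
criterion = nn.CrossEntropyLoss()

def dummy_loader(n):
    # Random tensors stand in for the real RAVEN/I-RAVEN/RAVEN-FAIR data.
    xs = torch.randn(n, 16, IMG_SIZE, IMG_SIZE)
    ys = torch.randint(0, 8, (n,))
    return DataLoader(TensorDataset(xs, ys), batch_size=BATCH_SIZE, shuffle=True)

train_loader, val_loader = dummy_loader(64), dummy_loader(32)

@torch.no_grad()
def val_accuracy():
    model.eval()
    correct = total = 0
    for x, y in val_loader:
        correct += (model(x).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / total

best_acc, stale = 0.0, 0
for epoch in range(MAX_EPOCHS):
    model.train()
    for x, y in train_loader:
        optimizer.zero_grad()
        criterion(model(x), y).backward()
        optimizer.step()
    acc = val_accuracy()
    if acc > best_acc:
        best_acc, stale = acc, 0
    else:
        stale += 1
        if stale >= PATIENCE:
            break  # early stopping, per the 20-epoch criterion
```

Stacking the 16 panels along the channel axis is only one common way to feed an RPM instance to a network; the paper's hierarchical ConViT with its attention-based relational reasoner processes panels quite differently, so consult the released code for the faithful pipeline.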