Hierarchical ConViT with Attention-Based Relational Reasoner for Visual Analogical Reasoning
Authors: Wentao He, Jialu Zhang, Jianfeng Ren, Ruibin Bai, Xudong Jiang
Venue: AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on three RPM datasets demonstrate that the proposed HCV-ARR achieves a significant performance gain compared with the state-of-the-art models. |
| Researcher Affiliation | Academia | (1) The Digital Port Technologies Lab, School of Computer Science, University of Nottingham Ningbo China; (2) Nottingham Ningbo China Beacons of Excellence Research and Innovation Institute, University of Nottingham Ningbo China; (3) School of Electrical & Electronic Engineering, Nanyang Technological University |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code is available at: https://github.com/wentaoheunnc/HCV-ARR. |
| Open Datasets | Yes | The proposed model is evaluated on the RAVEN (Zhang et al. 2019a), I-RAVEN (Hu et al. 2021) and RAVEN-FAIR datasets (Benny, Pekar, and Wolf 2021). |
| Dataset Splits | Yes | Each dataset is randomly split into 10 folds, with 6 folds for training, 2 folds for validation and 2 folds for testing (see the split sketch after the table). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions using the Adam optimizer but does not specify any software names with version numbers required to replicate the experiment. |
| Experiment Setup | Yes | The number of ConViT blocks is set to N = 3. Following the settings in (Benny, Pekar, and Wolf 2021; Zhang et al. 2019a,b), the input images are resized to 80×80 pixels. The maximum number of epochs is 200, and the training is stopped if there is no significant improvement on the validation set over 20 epochs. During training, the learning rate is set to 0.001, and the Adam optimizer is utilized with a weight decay of 1×10⁻⁵. The batch size is set to 32 (see the training-configuration sketch after the table). |
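The 10-fold split reported in the Dataset Splits row can be expressed as a short procedure. The sketch below is an assumption about how such a split might be implemented (the function name `split_indices` and the use of NumPy are illustrative, not taken from the authors' code): it randomly permutes sample indices and assigns 6 folds to training, 2 to validation, and 2 to testing.

```python
# Hypothetical sketch of the 10-fold split (6 train / 2 val / 2 test) described above.
import numpy as np

def split_indices(num_samples: int, seed: int = 0):
    """Randomly partition sample indices into 10 folds: 6 train, 2 val, 2 test."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(num_samples), 10)
    train = np.concatenate(folds[:6])
    val = np.concatenate(folds[6:8])
    test = np.concatenate(folds[8:])
    return train, val, test
```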
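The Experiment Setup row translates directly into an optimizer and early-stopping configuration. The PyTorch sketch below is a minimal illustration of those reported hyperparameters only; the model is a stand-in placeholder (the actual HCV-ARR architecture is in the linked repository), and the training and validation passes are elided.

```python
# Minimal PyTorch sketch of the reported training setup: Adam, lr=1e-3,
# weight decay=1e-5, batch size 32, at most 200 epochs, early stopping after
# 20 epochs without improvement on the validation folds, 80x80 input images.
# The model below is a placeholder, not the authors' HCV-ARR architecture.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(80 * 80, 8))  # stand-in for HCV-ARR (N = 3 ConViT blocks)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)
batch_size = 32

best_val_acc, patience, stale_epochs = 0.0, 20, 0
for epoch in range(200):                         # maximum of 200 epochs
    # ... one training pass over mini-batches of 32 images resized to 80x80 ...
    val_acc = 0.0                                # placeholder for accuracy on the validation folds
    if val_acc > best_val_acc:
        best_val_acc, stale_epochs = val_acc, 0
    else:
        stale_epochs += 1
        if stale_epochs >= patience:             # stop if no improvement for 20 epochs
            break
```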