Learning Visual Abstract Reasoning through Dual-Stream Networks

Authors: Kai Zhao, Chang Xu, Bailu Si

AAAI 2024

Reproducibility Variable Result LLM Response
Research Type | Experimental | We conduct comprehensive empirical studies on several RPM benchmarks. To summarize, our contributions include: ... DRNet achieves remarkable generalization performance and outperforms other models on multiple datasets, showcasing the effectiveness of this framework for nonverbal visual abstract reasoning problems.
Researcher Affiliation | Academia | Kai Zhao (1), Chang Xu (1), Bailu Si (1, 2)*; (1) School of Systems Science, Beijing Normal University; (2) Chinese Institute for Brain Research, Beijing. Emails: zhaokai id@foxmail.com, changxu@mail.bnu.edu.cn, bailusi@bnu.edu.cn
Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found.
Open Source Code | Yes | Codes are available at https://github.com/VecchioID/DRNet.
Open Datasets | Yes | The PGM dataset (Santoro et al. 2018) comprises 1.2 million training samples, 20 thousand validation samples, and 200 thousand test samples. ... The RAVEN dataset (Zhang et al. 2019a), along with its variants I-RAVEN (Hu et al. 2021) and RAVEN-FAIR (Benny, Pekar, and Wolf 2021), are compact datasets, each comprising 7 configurations, with each configuration containing 10,000 samples.
Dataset Splits | Yes | The PGM dataset (Santoro et al. 2018) comprises 1.2 million training samples, 20 thousand validation samples, and 200 thousand test samples. ... The distribution ratio across the training, test, and validation sets is 6:2:2.
Hardware Specification | No | No specific hardware details (e.g., GPU model, CPU model, memory) used for running experiments were provided.
Software Dependencies | No | The paper mentions using the Adam (Da 2014) optimizer but does not specify version numbers for any software, libraries, or programming languages.
Experiment Setup | Yes | We utilize a standard batch size of 256 and evaluate the reported accuracy on the test set using the best validation accuracy checkpoint. The same set of hyperparameters is applied across all benchmarks, employing Adam (Da 2014) optimizer with a learning rate of 3e-4, β values of (0.9, 0.999), and a weight decay of 1e-6. No additional supervision signals, such as metadata, are utilized during training. Additionally, for RAVEN-style datasets, we present the median outcome from 5 distinct runs. ... In the experimental results presented below, when the validation loss no longer decreases within 20 epochs, we perform early stopping.
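The quoted setup amounts to a fixed optimizer configuration plus patience-based early stopping. A minimal sketch of that protocol, in pure Python, is shown below; all names (the config dict, the helper function, the example loss sequence) are illustrative and are not taken from the authors' released code.

```python
# Hyperparameters as quoted from the paper's experiment setup.
OPTIMIZER_CONFIG = {
    "name": "Adam",
    "lr": 3e-4,
    "betas": (0.9, 0.999),
    "weight_decay": 1e-6,
}
BATCH_SIZE = 256
PATIENCE = 20  # epochs without validation improvement before early stopping


def train_with_early_stopping(val_losses):
    """Return the index of the best-validation epoch, stopping once the
    validation loss has not decreased for PATIENCE consecutive epochs.
    The checkpoint from the best epoch is what test accuracy is reported on."""
    best_loss = float("inf")
    best_epoch = -1
    stagnant = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, stagnant = loss, epoch, 0
        else:
            stagnant += 1
            if stagnant >= PATIENCE:
                break
    return best_epoch


# Example: loss improves through epoch 3, then plateaus; training halts after
# 20 stagnant epochs and the epoch-3 checkpoint would be kept.
losses = [1.0, 0.8, 0.6, 0.5] + [0.55] * 40
assert train_with_early_stopping(losses) == 3
```

In an actual training script the `val_losses` sequence would be produced epoch by epoch by the model's validation pass, with the optimizer built from `OPTIMIZER_CONFIG`.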