A Logic-based Approach to Contrastive Explainability for Neurosymbolic Visual Question Answering
Authors: Thomas Eiter, Tobias Geibinger, Nelson Higuera, Johannes Oetsch
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our approach on the CLEVR dataset, which we extend by more sophisticated questions to further demonstrate the robustness of the modular architecture. While we achieve top performance compared to related approaches, we can also produce CEs for explanation, model debugging, and validation tasks, showing the versatility of the declarative approach to reasoning. From Section 4 (Evaluation): Prior to an evaluation of our CE approach, we test the accuracy of NSVQASP on CLEVR and compare it against NS-VQA and other baseline approaches. |
| Researcher Affiliation | Academia | Thomas Eiter, Tobias Geibinger, Nelson Higuera and Johannes Oetsch, Institute for Logic and Computation, TU Wien, Favoritenstraße 9-11, 1040 Vienna, Austria, {thomas.eiter, tobias.geibinger, nelson.ruiz, johannes.oetsch}@tuwien.ac.at |
| Pseudocode | No | The paper describes the logic and ASP rules but does not contain a dedicated section, figure, or block explicitly labeled 'Pseudocode' or 'Algorithm'. (An illustrative ASP sketch of the declarative reasoning follows this table.) |
| Open Source Code | Yes | Code and data are available from https://github.com/pudumagico/nsvqasp. |
| Open Datasets | Yes | We validate our approach on the CLEVR dataset, which we extend by several more sophisticated questions to further demonstrate the robustness of the modular architecture of NSVQASP. In particular, we add 20 new question templates for different versions of the new spatial relation *between*, equality of objects, and counting. While we achieve top performance compared to related neural and neurosymbolic approaches, we can moreover produce CEs. We show this for model explanation, debugging, and validation tasks, demonstrating the versatility of the declarative approach to reasoning within modular neurosymbolic VQA architectures. Code and data are available from https://github.com/pudumagico/nsvqasp. |
| Dataset Splits | Yes | The CLEVR dataset consists of 70k images plus 700k questions for training and 15k images plus 150k questions for validation. Questions are generated from templates which define the structure of a question. We extend the CLEVR dataset by introducing 20 new templates that include a new spatial relation *between*, questions regarding equality of objects, and new counting questions, respectively; consequently, they can be divided into three groups. We generated 200k new questions from the templates for training for each group and 150k questions for validation. |
| Hardware Specification | Yes | We use an Intel Core i7-12700K, 32GB RAM, and an NVIDIA GeForce RTX 3080 Ti for training. |
| Software Dependencies | Yes | We use clingo (v. 5.6.2) [Gebser et al., 2019] with unsatisfiable core-guided optimisation [Andres et al., 2012]. (A sketch of invoking this optimisation mode follows this table.) |
| Experiment Setup | No | The paper mentions using YOLOv5 and LSTM, and that 'YOLOv5 was trained with the CLEVR mini dataset', but it does not specify concrete hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations for these models. |
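
The reasoning component the Pseudocode row refers to is a declarative ASP program executed with clingo. Below is a minimal sketch, in Python via the clingo API, of how a CLEVR-style scene and question could be answered declaratively; the scene encoding, predicate names, and query rule are illustrative assumptions, not the paper's actual rules (those are in the linked repository).

```python
import clingo

# Hypothetical scene/question encoding for illustration only; the
# paper's actual ASP rules are at https://github.com/pudumagico/nsvqasp.
PROGRAM = """
% scene facts produced by the perception module (YOLOv5 in the paper)
obj(1). attr(1,color,red).  attr(1,shape,cube).
obj(2). attr(2,color,blue). attr(2,shape,sphere).
left(1,2).

% question: "what colour is the cube left of the sphere?"
answer(C) :- attr(O1,shape,cube), left(O1,O2),
             attr(O2,shape,sphere), attr(O1,color,C).
#show answer/1.
"""

ctl = clingo.Control()
ctl.add("base", [], PROGRAM)          # load the encoding
ctl.ground([("base", [])])            # ground the program
ctl.solve(on_model=lambda m: print(m.symbols(shown=True)))  # [answer(red)]
```

Because the scene and the question live in one logic program, explanation, debugging, and validation tasks reduce to adding further rules rather than changing the pipeline, which is the modularity the quoted passages emphasise.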
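The unsatisfiable core-guided optimisation cited in the Software Dependencies row corresponds to clingo's `--opt-strategy=usc` option. The sketch below is again a hedged illustration rather than the paper's encoding: it shows how a weak constraint paired with that option yields a cost-minimal set of scene revisions, the kind of minimality a contrastive explanation ("why not blue?") requires.

```python
import clingo

# Illustrative weak-constraint encoding (assumed, not the paper's):
# pick a cheapest set of scene-fact revisions that makes a foil answer true.
PROGRAM = """
candidate(attr(1,color,blue)). candidate(attr(1,shape,sphere)).
{ revise(F) : candidate(F) }.           % choose which facts to revise
:- not revise(attr(1,color,blue)).      % the foil answer must become true
:~ revise(F). [1@1,F]                   % minimise the number of revisions
#show revise/1.
"""

# --opt-strategy=usc switches clasp to unsatisfiable core-guided optimisation
ctl = clingo.Control(["--opt-strategy=usc"])
ctl.add("base", [], PROGRAM)
ctl.ground([("base", [])])
with ctl.solve(yield_=True) as handle:
    for model in handle:
        print("cost:", model.cost, model.symbols(shown=True))
```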