Modulating early visual processing by language

Authors: Harm de Vries, Florian Strub, Jeremie Mary, Hugo Larochelle, Olivier Pietquin, Aaron C. Courville

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply CBN to a pre-trained Residual Network (ResNet), leading to the MODulatEd ResNet (MODERN) architecture, and show that this significantly improves strong baselines on two visual question answering tasks. Our ablation study confirms that modulating from the early stages of the visual processing is beneficial. (See the CBN sketch after this table.)
Researcher Affiliation | Collaboration | Harm de Vries (University of Montreal, mail@harmdevries.com); Florian Strub (Univ. Lille, CNRS, Centrale Lille, Inria, UMR 9189 CRIStAL, florian.strub@inria.fr); Jérémie Mary (Univ. Lille, CNRS, Centrale Lille, Inria, UMR 9189 CRIStAL, jeremie.mary@univ-lille3.fr); Hugo Larochelle (Google Brain, hugolarochelle@google.com); Olivier Pietquin (DeepMind, pietquin@google.com); Aaron Courville (University of Montreal, aaron.courville@gmail.com)
Pseudocode | No | No explicit pseudocode or algorithm blocks are provided.
Open Source Code | Yes | The source code for our experiments is available at https://github.com/GuessWhatGame.
Open Datasets | Yes | In this paper, we focus on the VQAv1 dataset [1], which contains 614K questions on 204K images.
Dataset Splits | Yes | We train on the training set, do early-stopping on the validation set, and report the accuracies on the test-dev using the evaluation script provided by [1].
Hardware Specification | Yes | We thank NVIDIA for providing access to a DGX-1 machine used in this work.
Software Dependencies | No | The paper mentions software components like LSTM, GRU, and ResNet, but does not provide specific version numbers for any software libraries or frameworks used.
Experiment Setup | Yes | The hyperparameters are also provided in Appendix A.
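
The Research Type row refers to Conditional Batch Normalization (CBN), the mechanism MODERN uses to modulate a pre-trained ResNet with a question embedding. Below is a minimal sketch of such a CBN layer, assuming PyTorch; the class name, MLP sizes, and embedding dimensions are illustrative assumptions and are not taken from the authors' released code.

```python
# Minimal sketch of a Conditional Batch Normalization (CBN) layer.
# Assumption: PyTorch; names and sizes are illustrative, not the authors' code.
import torch
import torch.nn as nn


class ConditionalBatchNorm2d(nn.Module):
    """Batch norm whose scale/shift are modulated by a language embedding.

    The base affine parameters (gamma0, beta0) stand in for the frozen
    parameters of a pre-trained ResNet's BN layers; small offsets
    (delta_gamma, delta_beta) are predicted from the question embedding,
    so the effective parameters are gamma0 + delta_gamma and beta0 + delta_beta.
    """

    def __init__(self, num_features: int, lang_dim: int, hidden_dim: int = 256):
        super().__init__()
        # Normalization without its own affine transform; in MODERN the
        # pre-trained statistics and parameters would be kept frozen.
        self.bn = nn.BatchNorm2d(num_features, affine=False)
        # Frozen base parameters (would be copied from the pre-trained BN).
        self.register_buffer("gamma0", torch.ones(num_features))
        self.register_buffer("beta0", torch.zeros(num_features))
        # Trainable MLPs predicting per-channel offsets from the embedding.
        self.delta_gamma = nn.Sequential(
            nn.Linear(lang_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, num_features))
        self.delta_beta = nn.Sequential(
            nn.Linear(lang_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, num_features))

    def forward(self, x: torch.Tensor, lang: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map; lang: (B, lang_dim) question embedding.
        h = self.bn(x)
        gamma = self.gamma0 + self.delta_gamma(lang)   # (B, C)
        beta = self.beta0 + self.delta_beta(lang)      # (B, C)
        return gamma.unsqueeze(-1).unsqueeze(-1) * h + beta.unsqueeze(-1).unsqueeze(-1)


if __name__ == "__main__":
    cbn = ConditionalBatchNorm2d(num_features=64, lang_dim=1024)
    feats = torch.randn(2, 64, 56, 56)   # e.g. an early ResNet stage
    question = torch.randn(2, 1024)      # e.g. an LSTM question embedding
    print(cbn(feats, question).shape)    # torch.Size([2, 64, 56, 56])
```

In line with the paper's ablation finding that modulating the early stages of visual processing is beneficial, such a layer would replace the batch-norm layers throughout the pre-trained ResNet (including its early blocks), with only the offset-prediction MLPs and the question encoder being trained.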