Learning to Specialize with Knowledge Distillation for Visual Question Answering

Authors: Jonghwan Mun, Kimin Lee, Jinwoo Shin, Bohyung Han

NeurIPS 2018

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Our experimental results indeed demonstrate that our method outperforms other baselines for VQA and image classification. |
| Researcher Affiliation | Academia | 1 Computer Vision Lab., POSTECH, Pohang, Korea; 2 Algorithmic Intelligence Lab., KAIST, Daejeon, Korea; 3 Computer Vision Lab., ASRI, Seoul National University, Seoul, Korea |
| Pseudocode | No | The paper describes the training procedure in text (e.g., "Training procedure of MCL-KD is as follows...") but does not present it as structured pseudocode or a labeled algorithm block. |
| Open Source Code | No | The paper refers to publicly available implementations of baseline models (the bottom-up and top-down attention model and CMCL) but does not state that code for the proposed MCL-KD method is released, nor does it provide a link. |
| Open Datasets | Yes | We employ CLEVR and VQA v2.0 datasets to validate our algorithm. CLEVR [14] is constructed for an analysis of various aspects of visual reasoning... VQA v2.0 [9] is a very popular dataset based on images collected from MSCOCO [24]. |
| Dataset Splits | Yes | CLEVR [14]... is composed of 70,000 training images with 699,989 questions and 15,000 validation images with 149,991 questions... VQA v2.0 [9]... consists of 443,757 and 214,354 questions for train and validation, respectively. |
| Hardware Specification | No | The paper mentions training models and memory limitations (batch size reduced from 512 to 256) but does not specify any details about the hardware used, such as CPU or GPU models. |
| Software Dependencies | No | The paper mentions the ADAM optimizer and ResNet-101 and refers to external implementations for baselines, but does not specify version numbers for any software dependencies, such as deep learning frameworks or programming languages. |
| Experiment Setup | Yes | All models are optimized using ADAM [17] with a fixed learning rate of 0.0005 and batch size of 64 while the parameters of ResNet-101 are fixed. We set β and T in Eq. 4 to 50 and 0.1, respectively, based on our empirical observations. |
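The reported temperature T = 0.1 is a key hyperparameter of the paper's distillation loss (Eq. 4). The effect of such a temperature can be illustrated with a minimal temperature-scaled softmax, the standard building block of knowledge-distillation objectives; this is an illustrative sketch only, not the authors' code, and the example logits are made up.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax: p_i = exp(z_i / T) / sum_j exp(z_j / T).

    T > 1 flattens the distribution; T < 1 (e.g. the paper's T = 0.1)
    sharpens it toward a near one-hot target.
    """
    scaled = [z / temperature for z in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits, purely for illustration.
logits = [2.0, 1.0, 0.1]
probs_t1 = softmax(logits, temperature=1.0)   # moderately peaked
probs_t01 = softmax(logits, temperature=0.1)  # nearly one-hot on argmax
```

With T = 0.1 the scaled logits become [20, 10, 1], so almost all probability mass concentrates on the largest logit, which matches the intuition that a small temperature produces sharp distillation targets.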