Joint Modeling of Visual Objects and Relations for Scene Graph Generation

Authors: Minghao Xu, Meng Qu, Bingbing Ni, Jian Tang

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on both the relationship retrieval and zero-shot relationship retrieval tasks prove the efficiency and efficacy of our proposed approach.
Researcher Affiliation | Academia | 1 Shanghai Jiao Tong University, Shanghai 200240, China; 2 Mila Québec AI Institute; 3 University of Montréal; 4 HEC Montréal; 5 CIFAR AI Research Chair. Emails: {xuminghao118, nibingbing}@sjtu.edu.cn, meng.qu@umontreal.ca, jian.tang@hec.ca
Pseudocode | Yes | Algorithm 1: Inference algorithm of JM-SGG.
Open Source Code | No | Our method is implemented under PyTorch [25], and the source code will be released for reproducibility.
Open Datasets | Yes | We use the Visual Genome (VG) dataset [16] (CC BY 4.0 License), a large-scale database with structured image concepts, for evaluation. We use the pre-processed VG from Xu et al. [48] (MIT License) which contains 108k images with 150 object categories and 50 relation types.
Dataset Splits | Yes | Following previous works [53, 36, 37], we employ the original split with 70% images for training and 30% images for test, and 5k images randomly sampled from the training split are held out for validation. (A data-split sketch follows the table.)
Hardware Specification | Yes | An NVIDIA Tesla V100 GPU is used for training.
Software Dependencies | No | Our method is implemented under PyTorch [25]. The paper mentions PyTorch but does not specify a version number or other software dependencies with version information.
Experiment Setup | Yes | In our experiments, the object detector is first pre-trained by an SGD optimizer (batch size: 4, initial learning rate: 0.001, momentum: 0.9, weight decay: 5 × 10^-4) for 20 epochs, and the learning rate is multiplied by 0.1 after the 10th epoch. During maximum likelihood learning, we train the potential functions and fine-tune the object detector with another SGD optimizer (batch size: 4, potential function learning rate: 0.001, detector learning rate: 0.0001, momentum: 0.9, weight decay: 5 × 10^-4) for 10 epochs, and the learning rate is multiplied by 0.1 after the 5th epoch. Unless otherwise specified, the iteration number N_T is set as 1 for training and 2 for test, and the per-image sampling size N_S is set as 3. (An optimizer-schedule sketch follows the table.)
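
The Dataset Splits row describes a 70%/30% train/test division of the pre-processed Visual Genome images, with 5k training images held out for validation. The snippet below is a minimal sketch of such a split; the function and variable names (make_vg_splits, image_ids) are illustrative and not taken from the paper's code.

```python
import random

def make_vg_splits(image_ids, val_size=5000, train_ratio=0.7, seed=0):
    """Shuffle VG image IDs and split them into train/val/test lists."""
    rng = random.Random(seed)
    ids = list(image_ids)
    rng.shuffle(ids)

    # 70% of the images for training, the remaining 30% for test.
    n_train = int(train_ratio * len(ids))
    train_ids, test_ids = ids[:n_train], ids[n_train:]

    # Hold out 5k randomly sampled training images for validation.
    val_ids = train_ids[:val_size]
    train_ids = train_ids[val_size:]
    return train_ids, val_ids, test_ids

# Example with roughly the 108k images of the pre-processed VG:
train_ids, val_ids, test_ids = make_vg_splits(range(108000))
```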
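
The Experiment Setup row lists concrete SGD hyperparameters and learning-rate schedules for the two training stages. The following PyTorch sketch mirrors those settings under the assumption that the detector and potential functions are ordinary nn.Module objects; the stand-in modules and loop bodies are placeholders, not the actual JM-SGG implementation.

```python
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR

# Stand-ins for the real JM-SGG modules (placeholders, not the paper's code).
detector = torch.nn.Linear(8, 8)
potential_functions = torch.nn.Linear(8, 8)

# Stage 1: detector pre-training for 20 epochs; the learning rate is
# multiplied by 0.1 after the 10th epoch (batch size 4 in the paper).
pretrain_opt = SGD(detector.parameters(), lr=1e-3, momentum=0.9, weight_decay=5e-4)
pretrain_sched = MultiStepLR(pretrain_opt, milestones=[10], gamma=0.1)

# Stage 2: maximum likelihood learning for 10 epochs with separate learning
# rates for the potential functions (1e-3) and the detector (1e-4); the
# learning rate is multiplied by 0.1 after the 5th epoch.
joint_opt = SGD(
    [
        {"params": potential_functions.parameters(), "lr": 1e-3},
        {"params": detector.parameters(), "lr": 1e-4},
    ],
    momentum=0.9,
    weight_decay=5e-4,
)
joint_sched = MultiStepLR(joint_opt, milestones=[5], gamma=0.1)

for epoch in range(20):
    # ... one pre-training epoch over the VG training split goes here ...
    pretrain_opt.step()   # placeholder for the real per-batch updates
    pretrain_sched.step()

for epoch in range(10):
    # ... one maximum-likelihood epoch goes here; per the paper, N_T = 1
    # sampling iteration during training and N_S = 3 samples per image ...
    joint_opt.step()      # placeholder for the real per-batch updates
    joint_sched.step()
```

Per-parameter-group learning rates are used in the second stage so that a single optimizer can apply the lower fine-tuning rate to the detector while training the potential functions at the higher rate, matching the setup quoted above.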