Inference Fusion with Associative Semantics for Unseen Object Detection

Authors: Yanan Li, Pengyang Li, Han Cui, Donghui Wang

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments show that our inductive method can significantly boost the performance by 7.42% over inductive models, and even 5.25% over transductive models on MSCOCO dataset."
Researcher Affiliation | Collaboration | (1) Zhejiang Lab, (2) Zhejiang University, (3) University of California, Los Angeles
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | "The code is available https://github.com/Lppy/DPIF."
Open Datasets | Yes | "We evaluate the proposed method with three well-known detection benchmarks, Pascal VOC (Everingham et al. 2010), MSCOCO (Lin et al. 2014) and Visual Genome (Krishna et al. 2017)."
Dataset Splits | Yes | "For MSCOCO, we choose the 65/15 source/target split (Rahman, Khan, and Barnes 2019)... In the training stage, we first train the standard Faster R-CNN framework on the training set... In the second stage, we modify the classification branch to our proposed two parallel reasoning branches with randomly initialized weights. We fine-tune only the classification and association prediction networks, while keeping the entire feature extractor and box regression network fixed."
Hardware Specification | Yes | "For MSCOCO, the training stage takes about 20 hours with 2 TITAN V GPUs."
Software Dependencies | No | The paper states, "We implement our model using Pytorch." However, it does not specify version numbers for PyTorch or any other software dependencies, which are required for a reproducible description.
Experiment Setup | Yes | "K and α in multi-associative construction are set to 5 and 0.1, respectively... We use the SGD optimizer (Bottou 2010) with a batch size of 14 in the first step and a batch size of 18 in the second step, exponential decay rates of 0.9 and 0.999, weight decay of 0.0001 and a learning rate of 0.01 to train our model. In the inference stage, we apply NMS with a threshold of 0.7 to RPN to generate object proposals and NMS with a threshold of 0.3 on the predicted boxes to obtain the final detection results."
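The two NMS thresholds quoted in the experiment setup (0.7 on RPN proposals, 0.3 on the final predicted boxes) can be illustrated with a minimal greedy NMS sketch in plain Python. This is our own illustrative code, not taken from the paper or its repository; the function names and the toy boxes are assumptions.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1)
             - inter)
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, threshold):
    """Greedy NMS: visit boxes by descending score and keep a box only
    if its IoU with every already-kept box is at or below the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= threshold for j in keep):
            keep.append(i)
    return keep

# Thresholds quoted in the paper's inference stage.
RPN_NMS_THRESHOLD = 0.7    # applied to RPN object proposals
FINAL_NMS_THRESHOLD = 0.3  # applied to the final predicted boxes

# Toy example: two heavily overlapping boxes (IoU ~0.68) and one far away.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores, FINAL_NMS_THRESHOLD))  # the 0.3 threshold suppresses the overlap
print(nms(boxes, scores, RPN_NMS_THRESHOLD))    # the looser 0.7 threshold keeps all three
```

The looser 0.7 threshold at the proposal stage keeps near-duplicate candidates for the downstream heads to score, while the tighter 0.3 threshold on the final boxes removes duplicate detections of the same object.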