Inference Fusion with Associative Semantics for Unseen Object Detection
Authors: Yanan Li, Pengyang Li, Han Cui, Donghui Wang
AAAI 2021, pp. 1993-2001
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our inductive method can significantly boost the performance by 7.42% over inductive models, and even 5.25% over transductive models on MSCOCO dataset. |
| Researcher Affiliation | Collaboration | 1Zhejiang Lab 2Zhejiang University 3University of California, Los Angeles |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/Lppy/DPIF. |
| Open Datasets | Yes | We evaluate the proposed method with three well-known detection benchmarks, Pascal VOC (Everingham et al. 2010), MSCOCO (Lin et al. 2014) and Visual Genome (Krishna et al. 2017). |
| Dataset Splits | Yes | For MSCOCO, we choose the 65/15 source/target split (Rahman, Khan, and Barnes 2019)... In the training stage, we first train the standard Faster R-CNN framework on the training set... In the second stage, we modify the classification branch to our proposed two parallel reasoning branches with randomly initialized weights. We fine-tune only the classification and association prediction networks, while keeping the entire feature extractor and box regression network fixed. |
| Hardware Specification | Yes | For MSCOCO, the training stage takes about 20 hours with 2 TITAN V GPUs. |
| Software Dependencies | No | The paper states, "We implement our model using Pytorch." However, it does not specify version numbers for PyTorch or any other software dependencies, which are required for a reproducible description. |
| Experiment Setup | Yes | K and α in multi-associative construction are set to 5 and 0.1, respectively... We use the SGD optimizer (Bottou 2010) with a batch size of 14 in the first step and a batch size of 18 in the second step, exponential decay rates of 0.9 and 0.999, weight decay of 0.0001 and a learning rate of 0.01 to train our model. In the inference stage, we apply NMS with a threshold of 0.7 to RPN to generate object proposals and NMS with a threshold of 0.3 on the predicted boxes to obtain the final detection results. |
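The inference-stage settings quoted above apply NMS twice: once with a threshold of 0.7 on RPN proposals, and once with a threshold of 0.3 on the predicted boxes. As a minimal sketch of that step, here is a standard greedy NMS in NumPy (this is an illustrative implementation of the generic algorithm, not code from the DPIF repository); the two thresholds in the usage example are the ones reported in the paper.

```python
import numpy as np

def nms(boxes, scores, iou_thresh):
    """Greedy non-maximum suppression over [x1, y1, x2, y2] boxes.

    Returns the indices of kept boxes, highest score first.
    """
    order = scores.argsort()[::-1]  # process boxes in descending score order
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Intersection of the top-scoring box with all remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + areas - inter)
        # Drop boxes whose overlap with the kept box exceeds the threshold
        order = order[1:][iou <= iou_thresh]
    return keep

# Toy usage with the paper's two thresholds:
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
proposals = nms(boxes, scores, 0.7)  # looser threshold, as for RPN proposals
final = nms(boxes, scores, 0.3)      # stricter threshold, as for final detections
```

With the looser 0.7 threshold the two heavily overlapping boxes (IoU ≈ 0.68) both survive, while the stricter 0.3 threshold used on the final predictions suppresses the lower-scoring duplicate.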