Deconfounded Visual Grounding
Authors: Jianqiang Huang, Yu Qin, Jiaxin Qi, Qianru Sun, Hanwang Zhang
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On popular benchmarks, RED improves various state-of-the-art grounding methods by a significant margin. |
| Researcher Affiliation | Collaboration | 1Nanyang Technological University, Singapore 2Damo Academy, Alibaba Group 3Singapore Management University |
| Pseudocode | Yes | Algorithm 1: Visual Grounding with RED |
| Open Source Code | Yes | Code is available at: https://github.com/JianqiangH/DeconfoundedVG. |
| Open Datasets | Yes | RefCOCO, RefCOCO+ and RefCOCOg are three visual grounding benchmarks and their images are from MS-COCO (Lin et al. 2014). |
| Dataset Splits | Yes | RefCOCO (Yu et al. 2016) has ... is split into train/validation/testA/testB with 120,624/10,834/5,657/5,095 expressions, respectively. |
| Hardware Specification | Yes | Under fair settings, we test the speed of Yang's-V1 and Yang's-V1+RED on a single Tesla V100. |
| Software Dependencies | No | The paper mentions using BERT and K-Means but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | "We deployed the K-Means algorithm to cluster those into N = 10 clusters forming the confounder dictionary Dg in Eq. (7)." and "After N exceeding 10, the performance won't show further improvement, thus we set N = 10." See the clustering sketch after this table. |
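The Experiment Setup quote describes building the confounder dictionary Dg by K-Means clustering of visual features into N = 10 clusters. The following is a minimal sketch of that step only, assuming scikit-learn's `KMeans`, a hypothetical 256-dimensional feature matrix, and an illustrative helper name `build_confounder_dictionary`; apart from N = 10, none of these specifics come from the paper.

```python
# Hedged sketch: forming a confounder dictionary Dg as K-Means centroids.
# Feature dimension, sample count, and the synthetic data are assumptions
# for illustration; only N = 10 clusters is taken from the paper's setup.
import numpy as np
from sklearn.cluster import KMeans


def build_confounder_dictionary(features: np.ndarray, n_clusters: int = 10) -> np.ndarray:
    """Cluster feature vectors and return the centroids as Dg
    (shape: n_clusters x feature_dim)."""
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    kmeans.fit(features)
    return kmeans.cluster_centers_


if __name__ == "__main__":
    # Stand-in for pre-extracted visual features (10,000 samples, 256-d).
    rng = np.random.default_rng(0)
    fake_features = rng.standard_normal((10_000, 256)).astype(np.float32)
    Dg = build_confounder_dictionary(fake_features, n_clusters=10)
    print(Dg.shape)  # (10, 256)
```

In the paper's reported ablation, increasing N beyond 10 gave no further improvement, which is why N = 10 is the default here.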