One-Shot Affordance Detection
Authors: Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao, Dacheng Tao
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate the superiority of our model over previous representative ones in terms of both objective metrics and visual quality. The benchmark suite is at Project Page. |
| Researcher Affiliation | Collaboration | Hongchen Luo (1), Wei Zhai (1,3), Jing Zhang (2), Yang Cao (1), Dacheng Tao (3). Affiliations: (1) University of Science and Technology of China, China; (2) The University of Sydney, Australia; (3) JD Explore Academy, JD.com, China |
| Pseudocode | No | The paper describes its methodology using mathematical equations and text, and illustrates network architectures with diagrams, but it does not include formal pseudocode or algorithm blocks. |
| Open Source Code | No | The abstract mentions 'The benchmark suite is at Project Page.', but the paper neither provides a direct link to a source-code repository nor explicitly states that the code for the described methodology is publicly available. |
| Open Datasets | Yes | We construct the Purpose-driven Affordance Dataset (PAD) with images mainly from ILSVRC [Russakovsky et al., 2015], COCO [Lin et al., 2014], etc. |
| Dataset Splits | Yes | To benchmark different models comprehensively, we follow the k-fold evaluation protocol, where k is 3 in this paper. To this end, the dataset is divided into three parts with non-overlapped categories, where any two of them are used for training while the left part is used for testing. See the supplementary material for more details about the setting. (A minimal sketch of this split protocol appears after the table.) |
| Hardware Specification | Yes | We train the model for 40 epochs on a single NVIDIA 1080ti GPU with an initial learning rate 1e-4. |
| Software Dependencies | No | Our method is implemented in Pytorch and trained with the Adam optimizer [Kingma and Ba, 2014]. The backbone is resnet50 [He et al., 2016]. No version numbers are specified for PyTorch or other software dependencies. |
| Experiment Setup | Yes | We train the model for 40 epochs on a single NVIDIA 1080ti GPU with an initial learning rate 1e-4. The number of bases in the collaboration enhancement module is set to K=256. The number of E-M iteration steps is 3. Besides, two segmentation models (UNet [Ronneberger et al., 2015], PSPNet [Zhao et al., 2017]), three saliency detection models (CPD [Wu et al., 2019], BASNet [Qin et al., 2019], CSNet [Gao et al., 2020]) and one co-saliency detection model (CoEGNet [Fan et al., 2021]) are chosen for comparison. (Hedged sketches of this setup appear after the table.) |
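
The Dataset Splits row describes a 3-fold protocol over non-overlapping categories: two parts train, the third tests. A minimal sketch of such a split is given below; the category names and their count are placeholders, not the actual PAD label set.

```python
# Hedged sketch of a 3-fold, category-disjoint evaluation protocol:
# categories are split into three non-overlapping parts; each fold trains
# on two parts and tests on the remaining one.

def three_fold_splits(categories, k=3):
    """Yield (train_categories, test_categories) pairs, one per fold."""
    parts = [categories[i::k] for i in range(k)]  # round-robin, disjoint parts
    for test_idx in range(k):
        train = [c for i, part in enumerate(parts) if i != test_idx for c in part]
        yield train, parts[test_idx]

categories = [f"object_{i:02d}" for i in range(30)]  # placeholder names and count
for fold, (train_cats, test_cats) in enumerate(three_fold_splits(categories)):
    print(f"fold {fold}: {len(train_cats)} train / {len(test_cats)} test categories")
```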
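The Experiment Setup row reports K=256 bases and 3 E-M iteration steps for the collaboration enhancement module. The paper's exact module is not reproduced here; the sketch below assumes an EMANet-style expectation-maximization attention, which matches the "bases" and "E-M steps" terminology, and is an illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EMAttention(nn.Module):
    """Assumed EMANet-style E-M attention: K bases are iteratively refined
    against pixel features, then used to reconstruct the feature map.
    Defaults follow the reported K=256 and 3 E-M steps."""
    def __init__(self, channels, k=256, steps=3):
        super().__init__()
        self.steps = steps
        mu = torch.randn(1, channels, k)                 # (1, C, K) bases
        self.register_buffer("mu", F.normalize(mu, dim=1))

    def forward(self, x):
        b, c, h, w = x.shape
        feats = x.view(b, c, h * w)                      # (B, C, N) pixel features
        mu = self.mu.expand(b, -1, -1)                   # (B, C, K)
        for _ in range(self.steps):
            # E-step: soft assignment of pixels to bases.
            z = F.softmax(torch.bmm(feats.transpose(1, 2), mu), dim=2)  # (B, N, K)
            # M-step: bases become responsibility-weighted pixel means.
            z_norm = z / (1e-6 + z.sum(dim=1, keepdim=True))
            mu = F.normalize(torch.bmm(feats, z_norm), dim=1)           # (B, C, K)
        # Readout: reconstruct each pixel from the final bases.
        z = F.softmax(torch.bmm(feats.transpose(1, 2), mu), dim=2)
        return torch.bmm(mu, z.transpose(1, 2)).view(b, c, h, w)
```

This omits the momentum update of the global bases used in EMANet-style training; it is a simplification, not a claim about the authors' implementation.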
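The reported optimization settings (ResNet-50 backbone, Adam, initial learning rate 1e-4, 40 epochs on one GPU) can likewise be sketched. `OneShotAffordanceNet`, its mask head, the loss choice, and `train_loader` are all placeholders; this is not the authors' released code.

```python
import torch
import torch.nn.functional as F
from torch import nn, optim
from torchvision.models import resnet50

class OneShotAffordanceNet(nn.Module):
    """Placeholder wrapper around the reported ResNet-50 backbone; the real
    model also contains the collaboration enhancement module and a decoder."""
    def __init__(self):
        super().__init__()
        backbone = resnet50(pretrained=True)              # reported backbone
        self.encoder = nn.Sequential(*list(backbone.children())[:-2])
        self.head = nn.Conv2d(2048, 1, kernel_size=1)     # toy affordance-mask head

    def forward(self, x):
        return self.head(self.encoder(x))

device = "cuda" if torch.cuda.is_available() else "cpu"   # paper: one GTX 1080 Ti
model = OneShotAffordanceNet().to(device)
optimizer = optim.Adam(model.parameters(), lr=1e-4)       # reported Adam, lr 1e-4
criterion = nn.BCEWithLogitsLoss()                        # assumed mask loss

train_loader = []  # placeholder: substitute a DataLoader over the PAD fold here

for epoch in range(40):                                   # reported 40 epochs
    for images, masks in train_loader:
        images, masks = images.to(device), masks.to(device)
        preds = model(images)
        preds = F.interpolate(preds, size=masks.shape[-2:],
                              mode="bilinear", align_corners=False)
        loss = criterion(preds, masks)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```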