Weakly Supervised Multimodal Affordance Grounding for Egocentric Images
Authors: Lingjing Xu, Yang Gao, Wenfeng Song, Aimin Hao
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments, we demonstrate the superiority of our proposed method in terms of evaluation metrics and visual results when compared to existing affordance grounding models. Furthermore, ablation experiments confirm the effectiveness of our approach. |
| Researcher Affiliation | Academia | 1 State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, China; 2 Computer School, Beijing Information Science and Technology University, China |
| Pseudocode | No | The paper does not contain any clearly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Code: https://github.com/xulingjing88/WSMA |
| Open Datasets | Yes | We use the Affordance Grounding Dataset (AGD20K) (Luo et al. 2022b), a comprehensive dataset containing various viewpoints, specifically 20,061 exocentric and 3,755 egocentric images. These images represent 36 unique affordance categories. We conduct evaluations under two distinct settings: Seen and Unseen. In addition to AGD20K, we have assembled a new dataset, HICO-IIF, by selecting specific subsets from the HICO-DET (Chao et al. 2018) and IIT-AFF (Nguyen et al. 2017) datasets. |
| Dataset Splits | No | The paper mentions 'Seen' and 'Unseen' settings for evaluation, but does not provide specific train/validation/test dataset splits (e.g., percentages or sample counts) needed to reproduce the data partitioning. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used for running its experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions using DINO-ViT and CLIP as backbones, but it does not specify software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | We set the hyperparameters λ_cls, λ_clip, λ_d, and λ_rela to 1, 1, 0.5, and 0.5 respectively, while the threshold is fixed at 0.2. Further details regarding parameter configurations can be found in the Appendix. |
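
The reported weights suggest a weighted-sum training objective. The following is a minimal Python sketch of how such a combination might be reproduced; the loss-term names (`l_cls`, `l_clip`, `l_d`, `l_rela`) and the assumed role of the fixed 0.2 threshold (binarizing predicted affordance heatmaps) are assumptions for illustration, not the authors' released implementation.

```python
import torch

# Minimal sketch, assuming the four reported weights scale four loss terms in a
# weighted sum; the term names here are hypothetical placeholders.
def total_loss(l_cls, l_clip, l_d, l_rela,
               lambda_cls=1.0, lambda_clip=1.0, lambda_d=0.5, lambda_rela=0.5):
    return (lambda_cls * l_cls + lambda_clip * l_clip
            + lambda_d * l_d + lambda_rela * l_rela)

# Assumed use of the fixed 0.2 threshold: binarizing a predicted affordance
# heatmap before evaluation or visualization (an assumption, not stated above).
def binarize_heatmap(heatmap: torch.Tensor, threshold: float = 0.2) -> torch.Tensor:
    return (heatmap > threshold).float()
```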