HOI Analysis: Integrating and Decomposing Human-Object Interaction
Authors: Yong-Lu Li, Xinpeng Liu, Xiaoqian Wu, Yizhuo Li, Cewu Lu
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we first introduce the adopted datasets, metrics (Sec. 4.1) and implementation (Sec. 4.2). Next, we compare IDN with the state-of-the-art on HICO-DET [4] and V-COCO [18] in Sec. 4.3. As HOI detection metrics [4, 18] expect both accurate human/object locations and verb classification, the performance strongly relies on object detection. Hence, we conduct experiments to evaluate IDN with different object detectors. At last, ablation studies are conducted (Sec. 4.5). |
| Researcher Affiliation | Academia | Yong-Lu Li, Xinpeng Liu, Xiaoqian Wu, Yizhuo Li, Cewu Lu; Shanghai Jiao Tong University; yonglu_li@sjtu.edu.cn, xinpengliu0907@gmail.com, enlighten@sjtu.edu.cn, liyizhuo@sjtu.edu.cn, lucewu@sjtu.edu.cn |
| Pseudocode | No | The paper describes the proposed Integration-Decomposition Network (IDN) and its components using text and network diagrams (e.g., Fig. 3), but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/DirtyHarryLYL/HAKE-Action-Torch/tree/IDN-(Integrating-Decomposing-Network). |
| Open Datasets | Yes | We adopt the widely-used HICO-DET [4] and V-COCO [18]. |
| Dataset Splits | Yes | HICO-DET [4] consists of 47,776 images (38,118 for training and 9,658 for testing) and 600 HOI categories (80 COCO [29] objects and 117 verbs). V-COCO [18] contains 10,346 images (2,533 and 2,867 in train and validation sets, 4,946 in test set). |
| Hardware Specification | Yes | All experiments are conducted on one single NVIDIA Titan Xp GPU. |
| Software Dependencies | No | The paper mentions general software components like 'ResNet-50', 'Faster R-CNN', and optimizers like 'SGD', but it does not specify version numbers for any libraries, frameworks, or other software dependencies required to reproduce the experiments. |
| Experiment Setup | Yes | For HICO-DET [4], AE is pretrained for 4 epochs using SGD with a learning rate of 0.1, momentum of 0.9, while each batch contains 45 positive and 360 negative pairs. The whole IDN (AE and transformation modules) is first trained without inter-pair transformation (IPT) for 20 epochs using SGD with a learning rate of 2e-2, momentum of 0.9. Then we finetune IDN with IPT for 30 epochs using SGD, with a learning rate of 1e-3, momentum of 0.9. Each batch for the whole IDN contains 15 positive and 120 negative pairs. |
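
The three-stage HICO-DET schedule quoted in the Experiment Setup row can be written down as a small configuration, which makes the epoch counts and batch composition easier to check. The following is only a minimal sketch: the `Stage` class, the stage names, and the `use_ipt` flag are hypothetical names introduced here for illustration and are not taken from the authors' code. Only the numeric values come from the row above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One training stage (hypothetical container, not the authors' API)."""
    name: str
    epochs: int
    lr: float
    momentum: float
    pos_pairs: int   # positive human-object pairs per batch
    neg_pairs: int   # negative human-object pairs per batch
    use_ipt: bool    # assumption: whether inter-pair transformation is active

    @property
    def batch_size(self) -> int:
        return self.pos_pairs + self.neg_pairs

    @property
    def neg_per_pos(self) -> float:
        return self.neg_pairs / self.pos_pairs


# Values as reported for HICO-DET; every stage uses SGD.
HICO_DET_SCHEDULE = [
    Stage("ae_pretrain", epochs=4, lr=0.1, momentum=0.9,
          pos_pairs=45, neg_pairs=360, use_ipt=False),
    Stage("idn_without_ipt", epochs=20, lr=2e-2, momentum=0.9,
          pos_pairs=15, neg_pairs=120, use_ipt=False),
    Stage("idn_finetune_with_ipt", epochs=30, lr=1e-3, momentum=0.9,
          pos_pairs=15, neg_pairs=120, use_ipt=True),
]


def total_epochs(schedule) -> int:
    """Sum the epochs over all stages."""
    return sum(stage.epochs for stage in schedule)


if __name__ == "__main__":
    for stage in HICO_DET_SCHEDULE:
        print(stage.name, stage.epochs, stage.lr,
              stage.batch_size, stage.neg_per_pos)
    print("total epochs:", total_epochs(HICO_DET_SCHEDULE))
```

Under these numbers the schedule runs 54 epochs in total, the batches hold 405 pairs during AE pretraining and 135 pairs afterwards, and every stage keeps the same ratio of 8 negative pairs per positive pair.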