DIRV: Dense Interaction Region Voting for End-to-End Human-Object Interaction Detection

Authors: Hao-Shu Fang, Yichen Xie, Dian Shao, Cewu Lu

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments on two popular benchmarks: V-COCO and HICO-DET show that our approach outperforms existing state-of-the-arts by a large margin with the highest inference speed and lightest network architecture."
Researcher Affiliation | Academia | 1 Shanghai Jiao Tong University, 2 The Chinese University of Hong Kong
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Our code is publicly available at www.github.com/MVIG-SJTU/DIRV."
Open Datasets | Yes | "We evaluate our method on two popular datasets: V-COCO (Gupta and Malik 2015) and HICO-DET (Chao et al. 2015)."
Dataset Splits | Yes | "V-COCO dataset is a subset of COCO (Lin et al. 2014) with extra interaction labels. It contains 10,346 images (2,533 for training, 2,867 for validation and 4,946 for testing)."
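As a quick sanity check, the three reported split sizes are mutually consistent with the stated total. A minimal Python snippet (the split names are illustrative, not from the paper):

```python
# V-COCO split sizes as reported in the paper; the dict keys are illustrative.
splits = {"train": 2533, "val": 2867, "test": 4946}
assert sum(splits.values()) == 10346  # matches the stated 10,346 total images
```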
Hardware Specification | Yes | "All experiments are carried out on NVIDIA RTX 2080Ti GPUs."
Software Dependencies | No | The paper mentions software such as EfficientDet-d3 and the Adam optimizer but does not provide version numbers for key libraries or frameworks (e.g., Python, PyTorch/TensorFlow).
Experiment Setup | Yes | "We set the learning rate as 1e-4 with a batch size of 32."
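For context, below is a minimal sketch of that optimizer configuration, assuming a PyTorch implementation. Only the Adam optimizer, the 1e-4 learning rate, and the batch size of 32 come from the paper; the model and dataset here are stand-ins, not the actual DIRV network or the V-COCO/HICO-DET loaders.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-ins for the DIRV detector and its training data (placeholders only).
model = torch.nn.Linear(10, 1)
train_dataset = TensorDataset(torch.randn(64, 10), torch.randn(64, 1))

# Settings reported in the paper: Adam optimizer, lr = 1e-4, batch size = 32.
loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for inputs, targets in loader:
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(inputs), targets)
    loss.backward()
    optimizer.step()
```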