Improving Human-Object Interaction Detection via Phrase Learning and Label Composition

Authors: Zhimin Li, Cheng Zou, Yu Zhao, Boxun Li, Sheng Zhong

AAAI 2022, pp. 1509-1517

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments are conducted to prove the effectiveness of the proposed PhraseHOI, which achieves significant improvement over the baseline and surpasses previous state-of-the-art methods on the Full and Non-Rare splits of the challenging HICO-DET benchmark.
Researcher Affiliation | Collaboration | 1. National Key Laboratory of Science and Technology on Multispectral Information Processing, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology; 2. Megvii Technology
Pseudocode | No | The paper describes its methods in narrative text and with architectural diagrams, but it does not include any explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement about releasing source code, nor a link to a code repository for its methodology.
Open Datasets | Yes | Experiments are conducted on the V-COCO (Gupta and Malik 2015) and HICO-DET (Chao et al. 2018) benchmarks.
Dataset Splits | No | The paper specifies training and test set sizes for HICO-DET ('38,118 images in training set and 9,658 in test set') but does not explicitly mention a separate validation set size or split.
Hardware Specification | No | The paper mentions using ResNet-50 and ResNet-101 backbones, but it does not specify the hardware (e.g., GPU models, CPU types, memory) used for running the experiments.
Software Dependencies | No | The paper mentions using pre-trained models such as word2vec and GPT1 and initializing weights with DETR, but it does not provide specific version numbers for these or other software dependencies (e.g., Python, PyTorch, CUDA versions) needed for reproducibility.
Experiment Setup | Yes | The model is trained with AdamW; the learning rate is set to 1e-4, except that the learning rate for the backbone is set to 1e-5. The batch sizes for ResNet-50 and ResNet-101 are set to 64 and 32, respectively... All models are trained for 200 epochs with a single learning-rate decay at epoch 150. The hyper-parameter α in Eq. 1 is the loss weight of the phrase loss and is set to 0.1, the hyper-parameter β in Eq. 2 is the loss weight of the triplet loss and is set to 10, and the hyper-parameter m in Eq. 3 is set to 0.5 in experiments.
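
The quoted training setup translates into a short configuration script. Below is a minimal PyTorch-style sketch of it: the optimizer choice, the two learning rates, the batch sizes, the 200-epoch schedule with a decay at epoch 150, and the hyper-parameter values are taken from the excerpt above, while the `TinyHOIModel` placeholder, the decay factor of 0.1, and the `total_loss` composition are assumptions added for illustration (the exact model and equations are in the paper).

```python
# Minimal sketch of the reported training setup; assumptions flagged inline.
from torch import nn, optim


class TinyHOIModel(nn.Module):
    """Hypothetical stand-in for the PhraseHOI architecture (assumption)."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(8, 8)  # stands in for the ResNet backbone
        self.head = nn.Linear(8, 8)      # stands in for the HOI/phrase heads


model = TinyHOIModel()

# AdamW with lr 1e-4 overall and 1e-5 for the backbone, as reported.
backbone = [p for n, p in model.named_parameters() if n.startswith("backbone")]
rest = [p for n, p in model.named_parameters() if not n.startswith("backbone")]
optimizer = optim.AdamW(
    [{"params": rest, "lr": 1e-4}, {"params": backbone, "lr": 1e-5}]
)

# 200 epochs with a single decay at epoch 150; the decay factor gamma=0.1
# is an assumption, since the excerpt does not state it.
scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[150], gamma=0.1)

BATCH_SIZE = 64  # ResNet-50 backbone (32 for ResNet-101)
ALPHA = 0.1      # phrase-loss weight, Eq. 1
BETA = 10.0     # triplet-loss weight, Eq. 2
MARGIN = 0.5    # margin m in Eq. 3


def total_loss(l_hoi, l_phrase, l_triplet):
    # Hypothetical composition: the excerpt only says alpha weights the phrase
    # loss and beta the triplet loss; see Eqs. 1-3 in the paper for the exact form.
    return l_hoi + ALPHA * l_phrase + BETA * l_triplet
```

The two optimizer parameter groups reproduce the reported split between the backbone learning rate (1e-5) and the rest of the model (1e-4); MARGIN would feed a triplet-style loss with margin m, consistent with how Eq. 3 is described in the excerpt.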