Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Neural-Logic Human-Object Interaction Detection

Authors: Liulei Li, Jianan Wei, Wenguan Wang, Yi Yang

NeurIPS 2023 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate LOGICHOI on V-COCO and HICO-DET under both normal and zero-shot setups, achieving significant improvements over existing methods. Specifically, it achieves a mean mAP score of 65.0% across two scenarios. For in-depth analysis, we perform a series of ablative studies on HICO-DET [54] test.
Researcher Affiliation	Academia	Liulei Li1, Jianan Wei2, Wenguan Wang2, Yi Yang2; 1ReLER, AAII, University of Technology Sydney; 2CCAI, Zhejiang University
Pseudocode	No	The paper describes mathematical formulations and architectural components but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	https://github.com/weijianan1/LogicHOI
Open Datasets	Yes	We conduct experiments on two widely-used HOI detection benchmarks: V-COCO [53] is a carefully curated subset of MS-COCO [110] which contains 10,346 images (5,400 for training and 4,946 for testing). HICO-DET [54] consists of 47,776 images in total, with 38,118 for training and 9,658 designated for testing.
Dataset Splits	No	The paper specifies training and testing splits for V-COCO and HICO-DET, but does not explicitly mention a separate validation dataset split or its size/percentage for reproduction.
Hardware Specification	Yes	on 4 GeForce RTX 3090 GPUs.
Software Dependencies	No	The paper mentions software components like 'DETR', 'CLIP', and 'Adam optimizer' but does not provide specific version numbers for these or any other libraries or frameworks.
Experiment Setup	Yes	we conducted training for 90 epochs using the Adam optimizer with a batch size of 16 and base learning rate 1e-4, on 4 GeForce RTX 3090 GPUs. The learning rate is scheduled following a step policy, decayed by a factor of 0.1 at the 60th epoch. The number of human, object, action queries N_h, N_o, N_a is set to 32 for efficiency, and the hidden sizes of all the modules are set to D=768. Here α is set to 0.2 empirically. Following the convention [11,16,17,20,66], we set K to 100.
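
The training settings quoted in the Experiment Setup row can be collected into a small configuration, with the step learning-rate policy written out as a function. This is a minimal sketch and not the authors' code: the dictionary keys and the `learning_rate` helper are hypothetical names, and it assumes epochs are counted from 0 so that the decay takes effect from epoch index 60 onward, which the quoted text does not specify.

```python
# Values are taken from the Experiment Setup row; key names are illustrative only.
CONFIG = {
    "epochs": 90,
    "batch_size": 16,
    "base_lr": 1e-4,
    "lr_decay_factor": 0.1,
    "lr_decay_epoch": 60,
    "num_gpus": 4,
    "num_queries": 32,  # N_h = N_o = N_a
    "hidden_dim": 768,  # D
    "alpha": 0.2,
    "top_k": 100,       # K
}


def learning_rate(epoch: int, cfg: dict = CONFIG) -> float:
    """Return the learning rate for a given epoch under a single-step decay.

    Assumption: epochs are 0-indexed and the decay applies from
    cfg["lr_decay_epoch"] onward.
    """
    if not 0 <= epoch < cfg["epochs"]:
        raise ValueError(f"epoch must be in [0, {cfg['epochs']})")
    if epoch < cfg["lr_decay_epoch"]:
        return cfg["base_lr"]
    return cfg["base_lr"] * cfg["lr_decay_factor"]


if __name__ == "__main__":
    # Print the rate at the start, around the decay boundary, and at the end.
    for epoch in (0, 59, 60, 89):
        print(epoch, learning_rate(epoch))
```

Under these assumptions the rate stays at the base value for the first 60 epochs and is ten times smaller for the remaining 30. If the original run counted epochs from 1, the boundary would shift by one epoch.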