Video-based Human-Object Interaction Detection from Tubelet Tokens
Authors: Danyang Tu, Wei Sun, Xiongkuo Min, Guangtao Zhai, Wei Shen
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The effectiveness and efficiency of TUTOR are verified by extensive experiments. Results show our method outperforms existing works by large margins, with a relative mAP gain of 16.14% on VidHOI and a 2 points gain on CAD-120 as well as a 4× speedup. |
| Researcher Affiliation | Academia | 1 Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University; 2 MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University. {danyangtu, sunguwei, minxiongkuo, zhaiguangtao, wei.shen}@sjtu.edu.cn |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | 3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] The code is provided in supplementary material. |
| Open Datasets | Yes | We conduct experiments on VidHOI [5] and CAD-120 [22] benchmarks to evaluate the proposed methods by following the standard scheme. VidHOI is a large-scale dataset for V-HOI detection, comprising 6,366 videos for training and 756 videos for validation. ...CAD-120 is a relatively smaller dataset that consists of 120 RGB-D videos. |
| Dataset Splits | Yes | VidHOI is a large-scale dataset for V-HOI detection, comprising 6,366 videos for training and 756 videos for validation. |
| Hardware Specification | Yes | A batch size of 16 on 8 RTX-2080Ti GPUs, and learning rate lr = 2.5e-4 for Transformer and 1e-5 for FPN are used. |
| Software Dependencies | No | The paper mentions using an 'AdamW [31] optimizer' but does not specify versions for software dependencies such as PyTorch, TensorFlow, CUDA, or Python. |
| Experiment Setup | Yes | The dimension of HOI query is set to 256... The number of queries is set to 100 for VidHOI and 50 for CAD-120... We employed an AdamW [31] optimizer for 150 epochs. A batch size of 16... learning rate lr = 2.5e-4 for Transformer and 1e-5 for FPN are used. The lr decayed by half at the 50th, 90th and 120th epochs, respectively. We use lr = 1e-6 to warm up the training for the first 5 epochs, and then go back to 2.5e-4 and continue training. |
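
The optimizer and learning-rate schedule quoted in the Experiment Setup row can be expressed compactly. The sketch below is a hedged PyTorch reconstruction, not the authors' released code: the parameter-group names, the `weight_decay` value, and the use of `LambdaLR` are illustrative assumptions; only the learning rates, warmup, milestones, epoch count, and batch size come from the paper.

```python
# Hedged sketch of the training schedule described in the Experiment Setup row.
# transformer_params / fpn_params are placeholder names, not the authors' code.
import torch


def build_optimizer(transformer_params, fpn_params):
    # Two parameter groups with the learning rates reported in the paper:
    # 2.5e-4 for the Transformer and 1e-5 for the FPN, optimized with AdamW.
    return torch.optim.AdamW(
        [
            {"params": transformer_params, "lr": 2.5e-4},
            {"params": fpn_params, "lr": 1e-5},
        ],
        weight_decay=1e-4,  # assumption: weight decay is not stated in the quote
    )


def lr_scale(epoch):
    # Warm up at lr = 1e-6 for the first 5 epochs (scale given relative to the
    # Transformer base lr of 2.5e-4), then return to the base lr and halve it
    # at epochs 50, 90, and 120.
    if epoch < 5:
        return 1e-6 / 2.5e-4
    scale = 1.0
    for milestone in (50, 90, 120):
        if epoch >= milestone:
            scale *= 0.5
    return scale


# Usage: one scheduler step per epoch over the 150-epoch run.
# optimizer = build_optimizer(transformer.parameters(), fpn.parameters())
# scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lr_scale)
# for epoch in range(150):
#     train_one_epoch(...)  # batch size 16 across 8 RTX-2080Ti GPUs
#     scheduler.step()
```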