Trajectory Convolution for Action Recognition
Authors: Yue Zhao, Yuanjun Xiong, Dahua Lin
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on the Something-Something V1 and Kinetics datasets show that by explicitly taking the motion dynamics into account in the temporal operation, the proposed network obtains considerable improvements over Separable-3D, a competitive baseline. "To evaluate the effectiveness of our Trajectory Net, we conduct experiments on two benchmark datasets for action recognition: Something-Something V1 [8] and Kinetics [19]." (A hedged sketch of this temporal operation appears after the table.) |
| Researcher Affiliation | Collaboration | Yue Zhao, Department of Information Engineering, The Chinese University of Hong Kong (zy317@ie.cuhk.edu.hk); Yuanjun Xiong, Amazon Rekognition (yuanjx@amazon.com); Dahua Lin, Department of Information Engineering, The Chinese University of Hong Kong (dhlin@ie.cuhk.edu.hk) |
| Pseudocode | No | The paper describes algorithms using mathematical formulations and textual descriptions but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access information (specific repository link, explicit code release statement, or code in supplementary materials) for the source code of the methodology described. |
| Open Datasets | Yes | Something-Something V1 [8] is a large-scale crowd-sourced video dataset on human-object interaction. It contains 108,499 video clips in 174 classes. Kinetics [19] is a large-scale video dataset on human-centric activities sourced from YouTube. |
| Dataset Splits | Yes | Something-Something V1: the dataset is split into training, validation, and test subsets in a ratio of roughly 8:1:1. Kinetics: our version contains 240,436, 19,796, and 38,685 clips in the training, validation, and test subsets, respectively. |
| Hardware Specification | Yes | The network is tested on a workstation with Intel(R) Xeon(R) CPU (E5-2640 v3 @2.60GHz) and Nvidia Titan X GPU. |
| Software Dependencies | No | The paper mentions 'OpenCV with CUDA' for the TV-L1 algorithm but does not specify version numbers for key software components or libraries. (See the TV-L1 sketch after the table.) |
| Experiment Setup | Yes | The length of each input clip is 16 frames and the sampling step varies from 1 to 2. For Something-Something V1 the batch size is set to 64, while for Kinetics it is 128. On Kinetics, the network is trained with an initial learning rate of 0.01, which is reduced by 1/10 every 40 epochs; the whole training procedure takes 100 epochs. (See the schedule sketch after the table.) |
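
The operation the Research Type row refers to is trajectory convolution: a temporal convolution that aggregates features along motion paths given by optical flow, rather than at a fixed spatial location across frames. Below is a minimal PyTorch sketch of that idea; the tensor layout, the `warp_by_flow` helper, and the per-channel kernel weights are illustrative assumptions, not the authors' released implementation (which, per the Open Source Code row, is not publicly linked).

```python
# Minimal sketch of trajectory convolution: temporal aggregation along
# flow-given motion trajectories via bilinear sampling (assumed design).
import torch
import torch.nn as nn
import torch.nn.functional as F


def warp_by_flow(feat, flow):
    """Bilinearly sample `feat` (N, C, H, W) at positions displaced by
    `flow` (N, 2, H, W); channel 0 is the x displacement, channel 1 is y."""
    n, _, h, w = feat.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().to(feat.device)  # (2, H, W)
    pos = grid.unsqueeze(0) + flow                               # (N, 2, H, W)
    # Normalize pixel coordinates to [-1, 1] for grid_sample.
    pos_x = 2.0 * pos[:, 0] / max(w - 1, 1) - 1.0
    pos_y = 2.0 * pos[:, 1] / max(h - 1, 1) - 1.0
    grid_norm = torch.stack((pos_x, pos_y), dim=-1)              # (N, H, W, 2)
    return F.grid_sample(feat, grid_norm, align_corners=True)


class TrajectoryConv(nn.Module):
    """Temporal kernel size 3: combine each frame with its flow-warped
    neighbors using learned per-channel weights (illustrative)."""
    def __init__(self, channels):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(3, channels) / 3.0)

    def forward(self, x, fwd_flow, bwd_flow):
        # x: (N, C, T, H, W); flows: (N, 2, T, H, W) pointing to t+1 / t-1.
        n, c, t, h, w = x.shape
        out = []
        for ti in range(t):
            center = x[:, :, ti]
            prev = warp_by_flow(x[:, :, max(ti - 1, 0)], bwd_flow[:, :, ti])
            nxt = warp_by_flow(x[:, :, min(ti + 1, t - 1)], fwd_flow[:, :, ti])
            out.append(prev * self.weight[0].view(1, c, 1, 1)
                       + center * self.weight[1].view(1, c, 1, 1)
                       + nxt * self.weight[2].view(1, c, 1, 1))
        return torch.stack(out, dim=2)
```

The key contrast with a plain Separable-3D temporal convolution is the sampling position: instead of reading the same pixel in neighboring frames, the kernel follows the displacement given by the flow field.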
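For the flow mentioned in the Software Dependencies row, the paper used the TV-L1 algorithm via OpenCV with CUDA. Below is a minimal CPU sketch using the `cv2.optflow` bindings from opencv-contrib-python; the CUDA class (`cv2.cuda_OpticalFlowDual_TVL1`) follows the same create/calc pattern. The clip-and-quantize step is common action-recognition practice rather than something the extracted text specifies, and the file names are placeholders.

```python
# Sketch: TV-L1 optical flow extraction with OpenCV (opencv-contrib-python).
import cv2
import numpy as np

tvl1 = cv2.optflow.DualTVL1OpticalFlow_create()

prev = cv2.imread("frame_000.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder paths
curr = cv2.imread("frame_001.jpg", cv2.IMREAD_GRAYSCALE)

flow = tvl1.calc(prev, curr, None)  # (H, W, 2) float32 displacement field

# Common practice: clip displacements to [-20, 20] and quantize to uint8
# so flow can be stored as ordinary image files.
flow_q = np.clip(flow, -20, 20)
flow_q = ((flow_q + 20) * (255.0 / 40.0)).astype(np.uint8)
cv2.imwrite("flow_x_000.jpg", flow_q[:, :, 0])
cv2.imwrite("flow_y_000.jpg", flow_q[:, :, 1])
```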
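The Experiment Setup row translates directly into an optimization schedule. The sketch below assumes SGD with momentum (the optimizer is not named in the extracted text) and uses PyTorch's `StepLR` to implement the reported decay of 1/10 every 40 epochs over 100 epochs; the tiny model and loader are dummy stand-ins so the loop runs end to end.

```python
# Sketch of the reported Kinetics schedule: lr 0.01, /10 every 40 epochs,
# 100 epochs total. Model/loader are placeholders, not the TrajectoryNet.
import torch
from torch import nn

model = nn.Sequential(nn.Flatten(), nn.Linear(16 * 8, 400))  # 400 = Kinetics classes
train_loader = [(torch.randn(4, 16, 8), torch.randint(0, 400, (4,)))]

optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=40, gamma=0.1)

for epoch in range(100):  # the whole training procedure takes 100 epochs
    for clips, labels in train_loader:
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(clips), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()  # lr is multiplied by 0.1 after epochs 40 and 80
```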