V4D: 4D Convolutional Neural Networks for Video-level Representation Learning

Authors: Shiwen Zhang, Sheng Guo, Weilin Huang, Matthew R. Scott, Limin Wang

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments are conducted on three video recognition benchmarks, where V4D achieves excellent results, surpassing recent 3D CNNs by a large margin.
Researcher Affiliation | Collaboration | Shiwen Zhang, Sheng Guo, Weilin Huang & Matthew R. Scott: Malong Technologies, Shenzhen, China; Shenzhen Malong Artificial Intelligence Research Center, Shenzhen, China ({shizhang,sheng,whuang,mscott}@malong.com). Limin Wang: State Key Laboratory for Novel Software Technology, Nanjing University, China (lmwang@nju.edu.cn).
Pseudocode | Yes | Algorithm 1: V4D Inference. Networks: the structure of the network is divided into two sub-networks by the first 4D block, namely N3D and N4D. Input: U_infer action units from a holistic video, {A_1, A_2, ..., A_U_infer}. Output: the video-level prediction. (A minimal code sketch of this inference interface is given below the table.)
Open Source Code | No | The paper does not provide any explicit statements about the release of source code, nor does it include links to a code repository for the described methodology.
Open Datasets | Yes | We conduct experiments on three standard benchmarks: Mini-Kinetics (Xie et al., 2018), Kinetics-400 (Carreira & Zisserman, 2017), and Something-Something-v1 (Goyal et al., 2017).
Dataset Splits | Yes | Our version of Kinetics-400 contains 240,436 and 19,796 videos in the training and validation subsets, respectively. Our version of Mini-Kinetics contains 78,422 videos for training and 4,994 videos for validation. Each video has around 300 frames. Something-Something-v1 contains 108,499 videos in total, with 86,017 for training, 11,522 for validation, and 10,960 for testing.
Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as GPU models, CPU specifications, or memory.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or specific library versions).
Experiment Setup | Yes | We utilize an SGD optimizer with an initial learning rate of 0.01; weight decay is set to 10^-5 with a momentum of 0.9. The learning rate drops by a factor of 10 at epochs 35, 60, and 80, and the model is trained for 100 epochs in total. (A hedged code sketch of this setup is given below the table.)
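
The quoted Algorithm 1 header only states the interface of V4D inference: a video is split into U_infer action units, the network is divided at the first 4D block into N3D and N4D, and the output is a video-level prediction. Below is a minimal PyTorch-style sketch of one plausible reading of that interface; the module names n3d and n4d, the stacking of unit-level features along a new "unit" axis, and the per-unit looping are assumptions, and the paper's actual aggregation across units may differ.

```python
import torch
import torch.nn as nn

class V4DInference(nn.Module):
    """Hypothetical wrapper around the two sub-networks named in Algorithm 1."""

    def __init__(self, n3d: nn.Module, n4d: nn.Module):
        super().__init__()
        self.n3d = n3d  # layers before the first 4D block (per-unit 3D CNN)
        self.n4d = n4d  # remaining layers: 4D blocks and classifier head

    @torch.no_grad()
    def forward(self, units: torch.Tensor) -> torch.Tensor:
        # units: (U_infer, C, T, H, W) -- action units sampled from one video.
        feats = [self.n3d(units[i:i + 1]) for i in range(units.shape[0])]
        # Stack unit-level features along a new "unit" axis so the 4D blocks
        # in N4D can operate across action units (assumption).
        feats = torch.stack(feats, dim=1)  # (1, U_infer, C', T', H', W')
        return self.n4d(feats)             # video-level prediction
```

Under this reading, N3D is shared across action units and only N4D sees the unit axis, which is what makes the number of inference units a free parameter in the sketch.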
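The training setup quoted in the Experiment Setup row can likewise be made concrete. The sketch below takes only the stated hyperparameters (SGD, initial learning rate 0.01, momentum 0.9, weight decay 10^-5, learning rate divided by 10 at epochs 35, 60, and 80, 100 epochs in total) from the paper; the model and data are stand-ins, not the V4D network or the Kinetics pipeline.

```python
import torch
from torch import nn

# Placeholders: the real V4D backbone and video data pipeline are not shown.
model = nn.Linear(16, 400)                       # stand-in for the V4D network
data = [(torch.randn(8, 16), torch.randint(0, 400, (8,))) for _ in range(4)]

# Hyperparameters quoted above: SGD, lr 0.01, momentum 0.9, weight decay 1e-5.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                            momentum=0.9, weight_decay=1e-5)
# Learning rate divided by 10 at epochs 35, 60, and 80.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                 milestones=[35, 60, 80],
                                                 gamma=0.1)

for epoch in range(100):                         # trained for 100 epochs total
    for clips, labels in data:
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(clips), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()                             # advance the step schedule
```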