FASTER Recurrent Networks for Efficient Video Classification

Authors: Linchao Zhu, Du Tran, Laura Sevilla-Lara, Yi Yang, Matt Feiszli, Heng Wang (pp. 13098-13105)

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that our FASTER framework has significantly better accuracy/FLOPs trade-offs, achieving the state-of-the-art accuracy with 10× less FLOPs.
Researcher Affiliation | Collaboration | Facebook AI; ReLER, University of Technology Sydney; University of Edinburgh. {linchao.zhu, yi.yang}@uts.edu.au, lsevilla@ed.ac.uk, {trandu, mdf, hengwang}@fb.com
Pseudocode | No | The paper describes the architecture and equations for FAST-GRU, but does not provide a formally structured pseudocode or algorithm block. (The standard GRU recurrence that FAST-GRU builds on is sketched below the table for reference.)
Open Source Code | No | The paper mentions that the R(2+1)D backbone code is available at 'https://github.com/facebookresearch/VMZ', but does not state that the code for the proposed FASTER framework is open-source or publicly available.
Open Datasets | Yes | We choose the Kinetics (Kay et al. 2017) dataset as the major testbed for FASTER. ... We also report results on UCF-101 (Soomro, Zamir, and Shah 2012) and HMDB-51 (Kuehne et al. 2011).
Dataset Splits | No | The paper states that for Kinetics, 'We report top-1 accuracy on the validation set as labels on the testing set is not public available,' and that for UCF-101 and HMDB-51, 'we use Kinetics for pre-training and report mean accuracy on three testing splits.' While this implies the existence of training, validation, and testing sets, the paper does not provide the specific percentages, counts, or explicit split details needed for reproduction.
Hardware Specification | Yes | We measure the runtime speed of different methods on a TITAN X GPU with an Intel i7 CPU.
Software Dependencies | No | The paper mentions various models and techniques (e.g., CNNs, RNNs, GRU, LSTM, ResNet, SoftMax loss, Batch Normalization, ReLU) and a learning rate schedule, but does not provide specific software dependencies with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x).
Experiment Setup | Yes | Setups for clip-level backbones. We mostly follow the procedure in (Tran et al. 2018) to train the clip-level backbones except the following two changes. First, we scale the input video whose shorter side is randomly sampled in [256, 320] pixels, following (Wang et al. 2018). Second, we adopt the cosine learning rate schedule (Loshchilov and Hutter 2016). During training, we randomly sample L consecutive frames from a given video. ... We fix the total number of frames processed to be 256, i.e., N × L = 256. (A hedged code sketch of this preprocessing and schedule also follows the table.)
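The Pseudocode row refers to the FAST-GRU equations; those are not reproduced here, but as a reference point the recurrence below is the generic GRU (Cho et al. 2014) that FAST-GRU modifies, not the authors' variant:

```latex
% Standard GRU update for input x_t and previous hidden state h_{t-1};
% \sigma is the logistic sigmoid and \odot is element-wise multiplication.
% Generic recurrence only -- FAST-GRU's modifications are not shown.
\begin{aligned}
z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z)                     && \text{(update gate)} \\
r_t &= \sigma(W_r x_t + U_r h_{t-1} + b_r)                     && \text{(reset gate)} \\
\tilde{h}_t &= \tanh(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h)  && \text{(candidate state)} \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t         && \text{(new hidden state)}
\end{aligned}
```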
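For the Experiment Setup row, here is a minimal sketch of the two quoted training changes, assuming a PyTorch-style pipeline. Only the shorter-side range [256, 320], the cosine schedule, the sampling of L consecutive frames, and the constraint N × L = 256 come from the paper; every function name, the (N, L) choice, and the backbone/optimizer placeholders are hypothetical.

```python
# A minimal sketch under the assumptions stated above, not the authors' released code.
import random

import torch
import torch.nn.functional as F

N, L = 8, 32  # hypothetical split; the paper only fixes the product N * L = 256


def sample_clip(video: torch.Tensor, clip_len: int = L) -> torch.Tensor:
    """Randomly sample `clip_len` consecutive frames from a (T, C, H, W) float tensor."""
    start = random.randint(0, video.shape[0] - clip_len)
    return video[start:start + clip_len]


def random_short_side_scale(clip: torch.Tensor, min_size: int = 256, max_size: int = 320) -> torch.Tensor:
    """Resize so the shorter spatial side is a random value in [min_size, max_size] pixels."""
    target = random.randint(min_size, max_size)
    _, _, h, w = clip.shape
    if h <= w:
        new_h, new_w = target, int(round(w * target / h))
    else:
        new_h, new_w = int(round(h * target / w)), target
    return F.interpolate(clip, size=(new_h, new_w), mode="bilinear", align_corners=False)


# Cosine learning-rate schedule (Loshchilov and Hutter 2016). The backbone and
# optimizer hyperparameters are placeholders, not values reported in the paper.
backbone = torch.nn.Conv3d(3, 64, kernel_size=3)  # stand-in for the R(2+1)D backbone
optimizer = torch.optim.SGD(backbone.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)

for epoch in range(100):
    # ... train on randomly sampled, rescaled clips here ...
    scheduler.step()
```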