Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Learning End-to-end Video Classification with Rank-Pooling

Authors: Basura Fernando, Stephen Gould

ICML 2016 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We demonstrate our approach on action and activity recognition tasks. We conduct experiments on action and activity recognition tasks in video using two real-world datasets, and compare our approach against some strong baseline methods. |
| Researcher Affiliation | Academia | Basura Fernando (EMAIL), Research School of Engineering, The Australian National University, ACT 2601, Australia; Stephen Gould (EMAIL), Research School of Computer Science, The Australian National University, ACT 2601, Australia |
| Pseudocode | No | The paper does not contain a pseudocode block or a clearly labeled algorithm. |
| Open Source Code | No | The paper mentions using 'publicly available code (Fernando et al., 2015)' for a baseline method, but it does not state that the source code for the methodology described in *this* paper is open-source or provide a link. |
| Open Datasets | Yes | First, we use UCF-sports dataset (Rodriguez et al., 2008) for the task of action classification. Second, we use the Hollywood2 dataset (Laptev et al., 2008) for the task of activity recognition. |
| Dataset Splits | Yes | We use provided train-test splits for training and testing. It has 1,707 videos in total with a pre-defined split of 823 training videos and 884 test videos. |
| Hardware Specification | Yes | Using the full gradient optimization is ten times slower than the approximate method, resulting in processing videos at 5 frames per second versus 50 frames per second (for the approximate method) during training on a Titan-X GPU. |
| Software Dependencies | No | The paper mentions software like 'Caffe reference model' and 'MatConvNet', but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | We initialize the network with the Caffe reference model and use a variable learning rate starting from 0.01 down to 0.0001 over 60 epochs. We also use a weight decay of 0.0005 on an L2-regularizer over the model parameters. |
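The Experiment Setup quote gives the start and end learning rates (0.01 down to 0.0001 over 60 epochs) and the weight decay (0.0005), but not the shape of the decay. The sketch below assumes a geometric (log-spaced) schedule purely for illustration; the paper may have used a step or linear policy instead:

```python
def lr_schedule(epoch, total_epochs=60, lr_start=1e-2, lr_end=1e-4):
    """Geometric learning-rate decay from lr_start to lr_end.

    The decay shape is an assumption; the paper only states the
    endpoints (0.01 -> 0.0001) and the epoch count (60).
    """
    t = epoch / (total_epochs - 1)  # progress in [0, 1]
    return lr_start * (lr_end / lr_start) ** t

WEIGHT_DECAY = 5e-4  # L2-regularizer coefficient stated in the paper

if __name__ == "__main__":
    for e in (0, 30, 59):
        print(f"epoch {e:2d}: lr = {lr_schedule(e):.6f}")
```

Under this assumed schedule the rate starts at exactly 0.01 and reaches 0.0001 at the final epoch, matching the two values the paper reports.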