Delving Deeper into Convolutional Networks for Learning Video Representations
Authors: Nicolas Ballas, Li Yao, Chris Pal, Aaron Courville
ICLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically validate our approach on both Human Action Recognition and Video Captioning tasks. |
| Researcher Affiliation | Academia | MILA, Université de Montréal; École Polytechnique de Montréal. |
| Pseudocode | No | The paper provides mathematical equations for the GRU and GRU-RCN models but does not present them as structured pseudocode or algorithm blocks (a hedged sketch of the GRU-RCN recurrence follows the table). |
| Open Source Code | No | The paper does not provide any concrete access information (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described. |
| Open Datasets | Yes | We evaluate our approach on the UCF101 dataset Soomro et al. (2012). This dataset has 101 action classes spanning 13,320 YouTube video clips. |
| Dataset Splits | Yes | We report results on the first UCF101 split, as this is the most commonly used split in the literature. To perform a proper hyperparameter search, we use the videos from the UCF-Thumos validation split Jiang et al. (2014) as the validation set. |
| Hardware Specification | No | The paper mentions 'computing support' from agencies like Compute Canada, but it does not provide specific details on the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'Theano' but does not provide a specific version number. It also mentions optimizers like 'Adam' and 'Adadelta' but not specific library versions used for their implementation. |
| Experiment Setup | Yes | At each iteration, a batch of 64 videos is sampled randomly from the training set. To perform scale augmentation, we randomly sample the cropping width and height from {256, 224, 192, 168}. The temporal cropping size is set to 10. We then resize the cropped volume to 224x224x10. We use Adam (Kingma & Ba, 2014) with gradients computed by the backpropagation algorithm. We perform early stopping and choose the parameters that maximize the log-probability of the validation set (a sketch of the cropping step follows the table). |
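As noted in the Pseudocode row, the paper gives the GRU-RCN model only as equations: it keeps the standard GRU gating but replaces the fully connected products with convolutions over ConvNet feature maps, so the recurrent state preserves spatial structure. Since the authors released no code (and worked in Theano), the following is a minimal sketch only, written in PyTorch; the class and parameter names are ours, not the paper's.

```python
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    """Convolutional GRU cell in the spirit of GRU-RCN:
    fully connected GRU products are replaced by 2-D convolutions,
    so the hidden state keeps the spatial layout of the feature map."""

    def __init__(self, in_channels, hidden_channels, kernel_size=3):
        super().__init__()
        p = kernel_size // 2  # same-padding so H, W are preserved
        # Input-to-hidden convolutions for update gate, reset gate, candidate.
        self.conv_xz = nn.Conv2d(in_channels, hidden_channels, kernel_size, padding=p)
        self.conv_xr = nn.Conv2d(in_channels, hidden_channels, kernel_size, padding=p)
        self.conv_xh = nn.Conv2d(in_channels, hidden_channels, kernel_size, padding=p)
        # Hidden-to-hidden convolutions (no bias; the input path carries it).
        self.conv_hz = nn.Conv2d(hidden_channels, hidden_channels, kernel_size, padding=p, bias=False)
        self.conv_hr = nn.Conv2d(hidden_channels, hidden_channels, kernel_size, padding=p, bias=False)
        self.conv_hh = nn.Conv2d(hidden_channels, hidden_channels, kernel_size, padding=p, bias=False)

    def forward(self, x_t, h_prev):
        # x_t: (B, in_channels, H, W); h_prev: (B, hidden_channels, H, W)
        z = torch.sigmoid(self.conv_xz(x_t) + self.conv_hz(h_prev))          # update gate
        r = torch.sigmoid(self.conv_xr(x_t) + self.conv_hr(h_prev))          # reset gate
        h_tilde = torch.tanh(self.conv_xh(x_t) + self.conv_hh(r * h_prev))   # candidate state
        return (1 - z) * h_prev + z * h_tilde                                 # GRU state update
```

In use, one such cell would be applied per convolutional-map level and unrolled over the frames of a clip, with the final hidden states fed to the classifier; that wiring is described in the paper's prose rather than in code.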
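The Experiment Setup row describes scale augmentation by sampling the crop width and height from {256, 224, 192, 168}, taking a 10-frame temporal crop, and resizing the cropped volume to 224x224x10. A hedged NumPy sketch of that cropping and resizing step is below; the function name, the use of `scipy.ndimage.zoom`, and the bilinear interpolation choice are our assumptions, not details given by the authors.

```python
import random
import numpy as np
from scipy.ndimage import zoom  # stand-in resize op; the paper does not name one

CROP_SIZES = [256, 224, 192, 168]
TARGET_HW, TARGET_T = 224, 10

def sample_clip(video):
    """video: float array of shape (T, H, W, C) with T >= 10 and H, W >= 256.
    Returns a scale-augmented clip of shape (10, 224, 224, C)."""
    t, h, w, _ = video.shape
    crop_h = random.choice(CROP_SIZES)            # random spatial scale
    crop_w = random.choice(CROP_SIZES)
    t0 = random.randint(0, t - TARGET_T)          # temporal crop of 10 frames
    y0 = random.randint(0, h - crop_h)            # random spatial offset
    x0 = random.randint(0, w - crop_w)
    clip = video[t0:t0 + TARGET_T, y0:y0 + crop_h, x0:x0 + crop_w, :]
    # Resize each frame of the cropped volume to 224x224 (order=1: bilinear).
    scale = (1, TARGET_HW / crop_h, TARGET_HW / crop_w, 1)
    return zoom(clip, scale, order=1)
```

Batches of 64 such clips would then be fed to the optimizer (Adam in the paper), with early stopping on validation log-probability as quoted above.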