Adaptive Feature Abstraction for Translating Video to Language

Authors: Yunchen Pu, Martin Renqiang Min, Zhe Gan, Lawrence Carin

ICLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate quantitatively the effectiveness of our proposed adaptive spatiotemporal feature abstraction for translating videos to sentences with rich semantic structures.
Researcher Affiliation | Collaboration | Yunchen Pu, Department of Electrical and Computer Engineering, Duke University, yunchen.pu@duke.edu; Martin Renqiang Min, Machine Learning Group, NEC Laboratories America, renqiang@nec-labs.com; Zhe Gan, Department of Electrical and Computer Engineering, Duke University, zhe.gan@duke.edu; Lawrence Carin, Department of Electrical and Computer Engineering, Duke University, lcarin@duke.edu
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures).
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this paper.
Open Datasets | Yes | We present results on Microsoft Research Video Description Corpus (YouTube2Text) (Chen & Dolan, 2011).
Dataset Splits | Yes | For fair comparison, we used the same splits as provided in Yu et al. (2016), with 1200 videos for training, 100 videos for validation, and 670 videos for testing.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment.
Experiment Setup | No | The paper describes data preprocessing steps (e.g., 'all videos are resized to 112 × 112 spatially, with 2 frames per second') and feature extraction methods, but does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs, optimizer settings) or explicit training schedules in the main text.
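
The preprocessing and split figures quoted above can be made concrete with a minimal sketch. It is not the authors' code (none is released). The function name `sample_frame_indices`, the constants `SPLIT_SIZES` and `TARGET_SIZE`, and the even-spacing sampling strategy are all assumptions introduced here for illustration; the paper states only the 2 frames-per-second rate, the 112 × 112 size, and the 1200/100/670 split.

```python
from typing import List

# Figures quoted in the table above.
SPLIT_SIZES = {"train": 1200, "val": 100, "test": 670}
TARGET_SIZE = (112, 112)  # spatial size each sampled frame would be resized to
TARGET_FPS = 2.0          # frames kept per second of video


def sample_frame_indices(num_frames: int, native_fps: float,
                         target_fps: float = TARGET_FPS) -> List[int]:
    """Pick evenly spaced frame indices so about target_fps frames per second remain.

    Even spacing is an assumption; the paper does not say how frames are chosen.
    """
    if num_frames <= 0 or native_fps <= 0 or target_fps <= 0:
        return []
    duration_s = num_frames / native_fps
    # Keep at least one frame, even for clips shorter than 1 / target_fps seconds.
    n_keep = max(1, int(duration_s * target_fps))
    step = native_fps / target_fps
    return [min(num_frames - 1, int(i * step)) for i in range(n_keep)]


if __name__ == "__main__":
    # A hypothetical 10-second clip recorded at 30 fps.
    indices = sample_frame_indices(num_frames=300, native_fps=30.0)
    print(len(indices), indices[:4])
    print(sum(SPLIT_SIZES.values()), "videos in total")
```

Resizing itself is left out because the paper names no image library; any standard resize routine applied to the selected frames with `TARGET_SIZE` would complete the step.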