Channel Attention Is All You Need for Video Frame Interpolation

Authors: Myungsub Choi, Heewon Kim, Bohyung Han, Ning Xu, Kyoung Mu Lee (pp. 10663-10671)

AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We construct a comprehensive evaluation benchmark and demonstrate that the proposed approach achieves outstanding performance compared to the existing models with a component for optical flow computation.
Researcher Affiliation | Collaboration | Myungsub Choi (1), Heewon Kim (1), Bohyung Han (1), Ning Xu (2), Kyoung Mu Lee (1); (1) Computer Vision Lab. & ASRI, Seoul National University; (2) Amazon Go; {cms6539, ghimhw, bhhan, kyoungmu}@snu.ac.kr, ninxu@amazon.com
Pseudocode | No | No pseudocode or clearly labeled algorithm block was found in the paper.
Open Source Code | Yes | The source code for our framework is made public along with the pretrained models to facilitate reproduction: https://github.com/myungsub/CAIN
Open Datasets | Yes | We evaluate our model on three benchmark datasets commonly used in the recent works (Jiang et al. 2018; Liu et al. 2017; Niklaus and Liu 2018; Niklaus, Mai, and Liu 2017b; Xue et al. 2018): Middlebury optical flow (Baker et al. 2010), UCF101 (Soomro, Zamir, and Shah 2012), and Vimeo90K (Xue et al. 2018).
Dataset Splits | Yes | We use the training split of the Vimeo90K (Xue et al. 2018) dataset for training... The initial learning rate is 0.0001, which is reduced by a factor of 2 whenever the validation loss stops decreasing for more than 5 epochs. Our evaluation benchmark has four different settings: Easy, Medium, Hard, and Extreme, depending on the temporal gap between the two input frames.
Hardware Specification | Yes | A full training of our network takes about 4 days on a single Titan Xp GPU.
Software Dependencies | No | Our algorithm is implemented in PyTorch. The PyTorch version number is not specified.
Experiment Setup | Yes | We use the training split of the Vimeo90K (Xue et al. 2018) dataset for training, where our model is optimized by Adam (Kingma and Ba 2014) for 200 epochs (approximately 320K iterations); training is based on 256×256 patches and the batch size is 32. Random vertical and horizontal flipping along with random temporal order swapping between the two input frames is adopted for data augmentation. The initial learning rate is 0.0001, which is reduced by a factor of 2 whenever the validation loss stops decreasing for more than 5 epochs. We clip the gradient norm to be less than 0.1, which handles the gradient explosion issue.
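
Below is a minimal PyTorch sketch of the training recipe quoted above, for orientation only. The hyperparameters (Adam with initial learning rate 0.0001, halved when the validation loss plateaus for more than 5 epochs; gradient-norm clipping at 0.1; batch size 32 of 256×256 patches; flip and temporal-swap augmentation) come from the paper. The stand-in convolutional model, synthetic tensors, and L1 loss are illustrative assumptions, not the authors' CAIN implementation; see the repository linked above for the real code.

    import random
    import torch
    import torch.nn as nn

    # Stand-in for the CAIN network: maps two stacked RGB frames to one frame.
    model = nn.Conv2d(6, 3, kernel_size=3, padding=1)
    criterion = nn.L1Loss()  # assumption; the paper's training loss may differ
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    # Halve the learning rate when the validation loss stops decreasing
    # for more than 5 epochs, per the paper's schedule.
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode="min", factor=0.5, patience=5)

    for epoch in range(200):  # ~320K iterations at batch size 32 in the paper
        # Synthetic batch for illustration; real training samples random
        # 256x256 crops from the Vimeo90K training split.
        frame0 = torch.rand(32, 3, 256, 256)
        frame2 = torch.rand(32, 3, 256, 256)
        target = torch.rand(32, 3, 256, 256)  # ground-truth middle frame

        # Augmentation: random horizontal/vertical flips applied to all
        # three frames, plus random temporal order swapping of the inputs.
        if random.random() < 0.5:
            frame0, frame2, target = [torch.flip(t, dims=[-1])
                                      for t in (frame0, frame2, target)]
        if random.random() < 0.5:
            frame0, frame2, target = [torch.flip(t, dims=[-2])
                                      for t in (frame0, frame2, target)]
        if random.random() < 0.5:
            frame0, frame2 = frame2, frame0  # target (middle frame) unchanged

        loss = criterion(model(torch.cat([frame0, frame2], dim=1)), target)
        optimizer.zero_grad()
        loss.backward()
        # Clip the gradient norm to 0.1 to handle gradient explosion.
        nn.utils.clip_grad_norm_(model.parameters(), max_norm=0.1)
        optimizer.step()

        # In practice, step the scheduler on a held-out validation loss;
        # the training loss is reused here only to keep the sketch runnable.
        scheduler.step(loss.item())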