Self-Supervised Motion Magnification by Backpropagating Through Optical Flow

Authors: Zhaoying Pan, Daniel Geng, Andrew Owens

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our model both quantitatively through experiments with real and synthetic data, and qualitatively on real videos.
Researcher Affiliation | Academia | Zhaoying Pan, Daniel Geng, Andrew Owens; University of Michigan
Pseudocode | Yes | Pseudocode for our method for training and inference can be found in Algorithm 1 and Algorithm 2, respectively. (A hedged sketch of such a training loop appears after this table.)
Open Source Code | No | The paper provides a project webpage link (https://dangeng.github.io/FlowMag) but does not include an explicit statement about the release of source code for the methodology, nor a direct link to a code repository.
Open Datasets | Yes | To this end we curate a dataset containing 145k unlabeled frame pairs from several existing datasets, including YouTube-VOS-2019 [68], DAVIS [42], Vimeo-90k [66], Tracking Any Object (TAO) [8], and Unidentified Video Objects (UVO) [60].
Dataset Splits | Yes | The train set is collected by sampling from the train sets of the above datasets, with the exception of DAVIS, where we use the trainval split due to its smaller size. To construct a real-world test set, a total of 50 frame pairs are randomly collected from each dataset's test set, except for the TAO dataset, in which the validation set is used. After sampling and filtering, we obtain a training dataset of 145k frame pairs. Detailed information regarding the number of sampled raw pairs and the number of selected pairs can be found in Table A1 in our paper.
Hardware Specification | Yes | Model training is performed using a batch size of 40 with 4 A40 GPUs and an image size of 512×512.
Software Dependencies | No | The paper mentions a "PyTorch-like style" for its pseudocode and uses specific models such as ARFlow and RAFT, but it does not provide version numbers for the software libraries or frameworks used in the implementation.
Experiment Setup | Yes | Training parameters. The parameter λcolor is set to 10, while the magnification factors α range from 1 to 16, sampled geometrically. The learning rate is 3×10⁻⁴, and model training is performed using a batch size of 40 with 4 A40 GPUs and an image size of 512×512. For data augmentation, a random area is initially cropped with a scale within the range (0.7, 1.0), and the cropped area is then resized to 512×512. The image is further subjected to random horizontal or vertical flipping with a probability of 0.5, as well as random rotation within the range (−15°, 15°). Finally, strong color jittering is applied to the frames. These transformations are applied identically to both frames. (A paired-augmentation sketch also appears after this table.)
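Algorithm 1 itself is not reproduced in this report, but the paper's title and stated parameters (λcolor = 10, α sampled geometrically in [1, 16], flow from a frozen estimator such as ARFlow or RAFT) suggest the overall shape of the training step. The following is a minimal PyTorch sketch, not the authors' implementation: `magnifier`, `flow_net`, and the pixel-level color term are hypothetical stand-ins, and the paper's exact losses may differ.

```python
import torch

def train_step(magnifier, flow_net, frame_a, frame_b, optimizer,
               lambda_color=10.0, alpha_range=(1.0, 16.0)):
    """One self-supervised step. `magnifier` is the trainable network;
    `flow_net` is a frozen, differentiable optical-flow estimator
    (e.g. ARFlow or RAFT, as mentioned in the paper)."""
    lo, hi = alpha_range
    # Sample the magnification factor geometrically in [1, 16].
    alpha = lo * (hi / lo) ** torch.rand(1, device=frame_a.device)

    # Predict a motion-magnified second frame.
    frame_b_mag = magnifier(frame_a, frame_b, alpha)

    # Flow of the original pair serves as a fixed target; flow of the
    # magnified pair stays in the autograd graph, so the loss gradient
    # backpropagates through the flow network into the magnifier.
    with torch.no_grad():
        flow_orig = flow_net(frame_a, frame_b)
    flow_mag = flow_net(frame_a, frame_b_mag)

    # Magnified flow should be alpha times the original flow. The color
    # term below is a placeholder; the paper's color loss may differ.
    loss_flow = (flow_mag - alpha * flow_orig).abs().mean()
    loss_color = (frame_b_mag - frame_b).abs().mean()
    loss = loss_flow + lambda_color * loss_color

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```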
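The note that the transformations are "applied identically to both frames" is the subtle part of the augmentation recipe: parameters must be sampled once per pair, not once per frame, or the augmentation itself would inject spurious motion for the model to magnify. Below is a torchvision sketch under stated assumptions: the crop aspect-ratio bounds are torchvision's defaults and the "strong" jitter ranges are guesses, since the paper reports neither.

```python
import random
import torchvision.transforms.functional as TF
from torchvision.transforms import RandomResizedCrop

def augment_pair(frame_a, frame_b, size=512):
    """Sample each transform's parameters once and apply them to BOTH
    frames, preserving the motion between them."""
    # Random crop with scale in (0.7, 1.0), resized to 512x512; the
    # aspect-ratio bounds are torchvision defaults (not in the paper).
    i, j, h, w = RandomResizedCrop.get_params(
        frame_a, scale=(0.7, 1.0), ratio=(3.0 / 4.0, 4.0 / 3.0))
    frame_a = TF.resized_crop(frame_a, i, j, h, w, [size, size])
    frame_b = TF.resized_crop(frame_b, i, j, h, w, [size, size])

    # Horizontal or vertical flip, each taken with probability 0.5 here
    # (one reading of the paper's "horizontal or vertical flipping").
    if random.random() < 0.5:
        frame_a, frame_b = TF.hflip(frame_a), TF.hflip(frame_b)
    if random.random() < 0.5:
        frame_a, frame_b = TF.vflip(frame_a), TF.vflip(frame_b)

    # Shared random rotation in (-15, 15) degrees.
    angle = random.uniform(-15.0, 15.0)
    frame_a, frame_b = TF.rotate(frame_a, angle), TF.rotate(frame_b, angle)

    # Color jitter with shared factors; these ranges are assumed.
    brightness = random.uniform(0.6, 1.4)
    contrast = random.uniform(0.6, 1.4)
    saturation = random.uniform(0.6, 1.4)
    hue = random.uniform(-0.1, 0.1)

    def jitter(img):
        img = TF.adjust_brightness(img, brightness)
        img = TF.adjust_contrast(img, contrast)
        img = TF.adjust_saturation(img, saturation)
        return TF.adjust_hue(img, hue)

    return jitter(frame_a), jitter(frame_b)
```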