Self-Supervised Motion Magnification by Backpropagating Through Optical Flow
Authors: Zhaoying Pan, Daniel Geng, Andrew Owens
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our model both quantitatively through experiments with real and synthetic data, and qualitatively on real videos. |
| Researcher Affiliation | Academia | Zhaoying Pan, Daniel Geng, Andrew Owens (University of Michigan) |
| Pseudocode | Yes | Pseudocode for our method for training and inference can be found in Algorithm 1 and Algorithm 2 respectively. |
| Open Source Code | No | The paper provides a project webpage link (https://dangeng.github.io/FlowMag) but does not include an explicit statement about the release of source code for the methodology, nor a direct link to a code repository. |
| Open Datasets | Yes | To this end we curate a dataset containing 145k unlabeled frame pairs from several existing datasets, including YouTube-VOS-2019 [68], DAVIS [42], Vimeo-90k [66], Tracking Any Object (TAO) [8], and Unidentified Video Objects (UVO) [60]. |
| Dataset Splits | Yes | The train set is collected by sampling from the train sets of the above datasets, with the exception of DAVIS, where we use the trainval split due to its smaller size. To construct a real-world test set, a total of 50 frame pairs are randomly collected from each dataset's test set, except for the TAO dataset, in which the validation set is used. After sampling and filtering, we obtain a training dataset of 145k frame pairs. Detailed information regarding the number of sampled raw pairs and the number of selected pairs can be found in Table A1 in our paper. |
| Hardware Specification | Yes | model training is performed using a batch size of 40 with 4 A40 GPUs and an image size of 512×512. |
| Software Dependencies | No | The paper mentions "PyTorch-like style" for pseudocode and uses specific models like "ARFlow" and "RAFT", but it does not provide specific version numbers for software libraries or frameworks used in the implementation. |
| Experiment Setup | Yes | Training parameters. The parameter λ_color is set to 10, while the magnification factors α range from 1 to 16, sampled geometrically. The learning rate is 3×10⁻⁴, and model training is performed using a batch size of 40 with 4 A40 GPUs and an image size of 512×512. In terms of data augmentation, a random area is initially cropped with a scale within the range of (0.7, 1.0). The cropped area is then resized to dimensions of 512×512. Furthermore, the image is subjected to random horizontal or vertical flipping with a probability of 0.5, as well as random rotation within the range of (−15°, 15°). Finally, strong color jittering is applied to the frames. These transformations are applied identically to both frames. *(A code sketch of these settings follows the table.)* |
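The "Experiment Setup" row contains enough detail to sketch the training data pipeline. Below is a minimal PyTorch sketch of the geometrically sampled magnification factor and the pair-consistent augmentations it describes. The color-jitter magnitudes, the crop's aspect-ratio handling, and the function names are assumptions for illustration; the paper specifies only "strong" jittering, a crop scale range of (0.7, 1.0), and that transformations are applied identically to both frames.

```python
import math
import random

import torchvision.transforms.functional as TF


def sample_magnification(alpha_min=1.0, alpha_max=16.0):
    """Sample a magnification factor geometrically, i.e. uniformly
    in log space over [alpha_min, alpha_max] (here, [1, 16])."""
    u = random.random()
    return alpha_min * (alpha_max / alpha_min) ** u


def augment_pair(frame_a, frame_b, size=512):
    """Apply the reported augmentations identically to both frames
    of a training pair. Frames are [C, H, W] tensors in [0, 1]."""
    _, h, w = frame_a.shape

    # Random crop with area scale in (0.7, 1.0), then resize to 512x512.
    # Keeping the original aspect ratio here is an assumption.
    scale = random.uniform(0.7, 1.0)
    crop_h, crop_w = int(h * math.sqrt(scale)), int(w * math.sqrt(scale))
    top, left = random.randint(0, h - crop_h), random.randint(0, w - crop_w)
    frame_a = TF.resized_crop(frame_a, top, left, crop_h, crop_w, [size, size])
    frame_b = TF.resized_crop(frame_b, top, left, crop_h, crop_w, [size, size])

    # Random horizontal or vertical flip, each with probability 0.5.
    if random.random() < 0.5:
        frame_a, frame_b = TF.hflip(frame_a), TF.hflip(frame_b)
    if random.random() < 0.5:
        frame_a, frame_b = TF.vflip(frame_a), TF.vflip(frame_b)

    # Random rotation within (-15, 15) degrees, shared across the pair.
    angle = random.uniform(-15.0, 15.0)
    frame_a, frame_b = TF.rotate(frame_a, angle), TF.rotate(frame_b, angle)

    # "Strong color jittering": the factor ranges below are assumptions.
    jitter = [
        (TF.adjust_brightness, random.uniform(0.6, 1.4)),
        (TF.adjust_contrast, random.uniform(0.6, 1.4)),
        (TF.adjust_saturation, random.uniform(0.6, 1.4)),
        (TF.adjust_hue, random.uniform(-0.1, 0.1)),
    ]
    for fn, factor in jitter:
        frame_a, frame_b = fn(frame_a, factor), fn(frame_b, factor)

    return frame_a, frame_b
```

Sharing one set of sampled parameters across both frames matters for this method: it preserves the inter-frame motion that the optical-flow loss supervises, while still varying appearance and framing across training pairs.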