UnFlow: Unsupervised Learning of Optical Flow With a Bidirectional Census Loss

Authors: Simon Meister, Junhwa Hur, Stefan Roth

AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On the KITTI benchmarks, our unsupervised approach outperforms previous unsupervised deep networks by a large margin, and is even more accurate than similar supervised methods trained on synthetic datasets alone. By optionally fine-tuning on the KITTI training data, our method achieves competitive optical flow accuracy on the KITTI 2012 and 2015 benchmarks, thus in addition enabling generic pre-training of supervised networks for datasets with limited amounts of ground truth.
Researcher Affiliation | Academia | Simon Meister, Junhwa Hur, Stefan Roth; Department of Computer Science, TU Darmstadt, Germany
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Source code for training and evaluating our models is publicly available.
Open Datasets | Yes | Datasets for training: SYNTHIA. Ilg et al. (2017) showed that pre-training FlowNet on the synthetic Flying Chairs dataset before training on more complex and realistic datasets consistently improves the accuracy of their final networks. The SYNTHIA dataset consists of multiple views recorded from a vehicle driving through a virtual environment. We use left images from the front, back, left, and right views of all winter and summer driving sequences, which amount to about 37K image pairs. KITTI. The KITTI dataset (Geiger et al. 2013) consists of real road scenes captured by a car-mounted stereo camera rig. Cityscapes. The Cityscapes dataset (Cordts et al. 2016) contains real driving sequences annotated for semantic segmentation and instance segmentation, without optical flow ground truth.
Dataset Splits | Yes | For validation, we set aside 20% of the shuffled training pairs and fine-tune until the validation error increases, which generally occurs after about 70K iterations. (See the split/early-stopping sketch after this table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments; even a general statement such as "on a GPU", had it been present, would not suffice.
Software Dependencies | No | We implement all losses as well as the warping scheme with primitive TensorFlow functions (Abadi and others 2015). However, a specific version number for TensorFlow or any other software is not mentioned. (See the warping sketch after this table.)
Experiment Setup | Yes | As optimizer, we use Adam (Kingma and Ba 2015) with β1 = 0.9 and β2 = 0.999. Unsupervised SYNTHIA pre-training: We train for 300K iterations with a mini-batch size of 4 image pairs from the SYNTHIA data. We keep the initial learning rate of 1.0e-4 fixed for the first 100K iterations and then divide it by two after every 100K iterations. Unsupervised KITTI training: We train for 500K iterations with a mini-batch size of 4 image pairs from the raw KITTI data. We keep the initial learning rate of 1.0e-5 fixed for the first 100K iterations and then divide it by two after every 100K iterations. (See the training-schedule sketch after this table.)
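
For concreteness, the validation split described in the Dataset Splits row can be sketched as follows. This is a minimal sketch only, assuming a simple shuffled 80/20 split over image pairs and a periodic validation check; the names `image_pairs`, `train_step`, `validation_epe`, and `check_every` are hypothetical placeholders, not taken from the authors' code.

```python
# Hypothetical sketch: hold out 20% of the shuffled fine-tuning pairs for
# validation and stop once the validation error increases (the paper reports
# this happens after about 70K iterations). All names here are placeholders.
import random

def split_train_val(image_pairs, val_fraction=0.2, seed=0):
    """Shuffle the pairs and set aside a fraction for validation."""
    pairs = list(image_pairs)
    random.Random(seed).shuffle(pairs)
    n_val = int(len(pairs) * val_fraction)
    return pairs[n_val:], pairs[:n_val]  # (train, validation)

def fine_tune_until_val_error_increases(train_step, validation_epe, check_every=5000):
    """Run fine-tuning steps, checking the validation error periodically."""
    best_epe = float("inf")
    iteration = 0
    while True:
        train_step()
        iteration += 1
        if iteration % check_every == 0:
            epe = validation_epe()
            if epe > best_epe:
                return iteration  # validation error went up: stop fine-tuning
            best_epe = epe
```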
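
The Software Dependencies row mentions that the losses and the warping scheme are built from primitive TensorFlow functions, without naming a version. The sketch below shows one way such a backward-warping layer can be written from basic TensorFlow 1.x ops (bilinear sampling via `tf.gather_nd`); it is an illustrative assumption, not the authors' implementation, and the flow channel order (horizontal displacement first, then vertical) is assumed.

```python
# Illustrative backward warping with bilinear sampling, using only primitive
# TensorFlow 1.x ops. Not the authors' code; the flow channel order is assumed
# to be (u, v) = (horizontal, vertical) displacement in pixels.
import tensorflow as tf

def backward_warp(image, flow):
    """Sample `image` [B, H, W, C] at positions shifted by `flow` [B, H, W, 2]."""
    b, h, w, _ = tf.unstack(tf.shape(image), num=4)
    grid_x, grid_y = tf.meshgrid(tf.range(w), tf.range(h))           # each [H, W]
    grid = tf.cast(tf.stack([grid_x, grid_y], axis=-1), tf.float32)  # [H, W, 2]
    pos = grid[tf.newaxis] + flow                                    # [B, H, W, 2]

    x, y = pos[..., 0], pos[..., 1]
    x0, y0 = tf.floor(x), tf.floor(y)
    x1, y1 = x0 + 1.0, y0 + 1.0

    # Bilinear interpolation weights for the four neighbouring pixels.
    wa = (x1 - x) * (y1 - y)
    wb = (x1 - x) * (y - y0)
    wc = (x - x0) * (y1 - y)
    wd = (x - x0) * (y - y0)

    def gather(yy, xx):
        yy = tf.clip_by_value(tf.cast(yy, tf.int32), 0, h - 1)
        xx = tf.clip_by_value(tf.cast(xx, tf.int32), 0, w - 1)
        batch = tf.tile(tf.reshape(tf.range(b), [b, 1, 1]), [1, h, w])
        return tf.gather_nd(image, tf.stack([batch, yy, xx], axis=-1))

    return (wa[..., tf.newaxis] * gather(y0, x0) +
            wb[..., tf.newaxis] * gather(y1, x0) +
            wc[..., tf.newaxis] * gather(y0, x1) +
            wd[..., tf.newaxis] * gather(y1, x1))
```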
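
The optimizer and learning-rate schedule in the Experiment Setup row map directly onto standard TensorFlow 1.x primitives. The minimal sketch below assumes the unsupervised KITTI schedule (initial rate 1.0e-5, halved after every further 100K iterations, over 500K iterations) and uses a stand-in loss in place of the model's unsupervised objective; it illustrates the reported hyperparameters, not the authors' actual training script.

```python
# Minimal sketch of the reported optimization setup: Adam with beta1 = 0.9 and
# beta2 = 0.999, learning rate fixed for the first 100K iterations and then
# halved every 100K. TensorFlow 1.x API; the loss below is only a stand-in.
import tensorflow as tf

global_step = tf.train.get_or_create_global_step()

# Stand-in loss so the snippet builds a valid graph; replace with the model's
# unsupervised objective.
params = tf.get_variable("params", shape=[2], initializer=tf.zeros_initializer())
unsup_loss = tf.reduce_sum(tf.square(params))

# Unsupervised KITTI training: 500K iterations, initial learning rate 1.0e-5,
# halved after every further 100K iterations (SYNTHIA pre-training analogously
# uses 1.0e-4 over 300K iterations).
boundaries = [100000, 200000, 300000, 400000]
values = [1e-5, 5e-6, 2.5e-6, 1.25e-6, 6.25e-7]
learning_rate = tf.train.piecewise_constant(global_step, boundaries, values)

optimizer = tf.train.AdamOptimizer(learning_rate, beta1=0.9, beta2=0.999)
train_op = optimizer.minimize(unsup_loss, global_step=global_step)
```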