Unsupervised Learning of Scene Flow Estimation Fusing with Local Rigidity

Authors: Liang Liu, Guangyao Zhai, Wenlong Ye, Yong Liu

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Through the evaluation on the KITTI benchmark, we show that the proposed framework achieves state-of-the-art results amongst unsupervised methods."
Researcher Affiliation | Academia | "Liang Liu, Guangyao Zhai, Wenlong Ye and Yong Liu. Institute of Cyber-Systems and Control, Zhejiang University. {leonliuz, zgyddzyx, wenlongye}@zju.edu.cn, yongliu@iipc.zju.edu.cn"
Pseudocode | No | The paper describes its methods and processes in text, but it does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | "Our models and code are available at https://github.com/lliuz/unrigidflow."
Open Datasets | Yes | "We evaluate our method on the KITTI benchmark suite. More specifically, scene flow is evaluated on the KITTI 2015 scene flow benchmark... We adopt the images from the KITTI raw dataset [Geiger et al., 2013] for unsupervised training depth and flow models without using any ground truth of depth and optical flow."
Dataset Splits | Yes | "Since the KITTI raw dataset contains some samples from the validation set, we filtered all the scenes that appeared in the validation before training. The training in our method contains two stages. In the second stage, we fine-tune the models with the best validation accuracy in the first stage."
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory used for running the experiments.
Software Dependencies | No | The paper mentions "PyTorch [Paszke et al., 2017]" as an implementation framework, but it does not specify its version or list other software dependencies with their respective version numbers.
Experiment Setup | Yes | "All models are trained with the Adam optimizer [Kingma and Ba, 2015] with β1 = 0.9, β2 = 0.99, batch size of 4. We resize the images to 256 × 832 and data augmentation including random flip and time swap is used. ... The initial learning rate is 10^-4, and divided by 2 every 100k iterations, finishing at 300k iterations. ... The loss weights are changed to λr = 1, λc = 0.1. The learning rate is 10^-5, finishing at 30k iterations."
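The first-stage learning-rate schedule quoted above (start at 10^-4, divide by 2 every 100k iterations, stop at 300k) can be sketched as a small helper. This is our own illustrative reading of the quoted setup, not code from the authors' repository; the function name and clamping behavior are assumptions.

```python
def learning_rate(step, base_lr=1e-4, halve_every=100_000, total_steps=300_000):
    """Stepwise-decay schedule as described in the paper's experiment setup:
    initial LR 1e-4, divided by 2 every 100k iterations, training ends at 300k.
    (Illustrative sketch; naming and the clamp at total_steps are assumptions.)
    """
    step = min(step, total_steps)          # training finishes at total_steps
    return base_lr / (2 ** (step // halve_every))

# Example values implied by the schedule:
print(learning_rate(0))        # 1e-4  (initial rate)
print(learning_rate(150_000))  # 5e-5  (after the first halving at 100k)
print(learning_rate(250_000))  # 2.5e-5 (after the second halving at 200k)
```

With PyTorch, the same schedule could be attached to an Adam optimizer (β1 = 0.9, β2 = 0.99, as quoted) via `torch.optim.lr_scheduler.LambdaLR`, passing `lambda step: 0.5 ** (min(step, 300_000) // 100_000)` as the multiplier.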