SAMFlow: Eliminating Any Fragmentation in Optical Flow with Segment Anything Model
Authors: Shili Zhou, Ruian He, Weimin Tan, Bo Yan
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our proposed SAMFlow model reaches 0.86/2.10 clean/final EPE and 3.55/12.32 EPE/F1-all on the Sintel and KITTI-15 training sets, surpassing FlowFormer by 8.5%/9.9% and 13.2%/16.3%. Furthermore, our model achieves state-of-the-art performance on the Sintel and KITTI-15 benchmarks, ranking #1 among all two-frame methods on the Sintel clean pass. (The EPE/F1-all metrics are sketched after the table.) |
| Researcher Affiliation | Academia | School of Computer Science, Shanghai Key Laboratory of Intelligent Information Processing, Fudan University slzhou19@fudan.edu.cn, rahe16@fudan.edu.cn, wmtan@fudan.edu.cn, byan@fudan.edu.cn |
| Pseudocode | No | No pseudocode or algorithm listing is provided; the modules are described only through numbered equations: "Overall, this module can be represented by Formula 4, 5, 6 and 7." |
| Open Source Code | No | No explicit statement or link for open-source code was found in the paper. |
| Open Datasets | Yes | With the above designs, our proposed SAMFlow achieves remarkable performance, reaching 0.86/2.10 clean/final EPE on the Sintel (Butler et al. 2012) training set and 3.55/12.32 EPE/F1-all on the KITTI-15 (Geiger et al. 2013) training set. |
| Dataset Splits | No | Training Settings: We follow the setup of previous work (Huang et al. 2022a) and divide the training into two stages: C+T-Stage and C+T+S+K+H-Stage. To speed up training, we skip the stage of training on the Chairs dataset by using the FlowFormer-things checkpoint as initialization, and the SAM encoder is kept frozen during training. |
| Hardware Specification | No | Figure 7: Runtime and accuracy comparison between FlowFormer, FlowFormer++, and our models with different SAM encoders, including SAM-B, SAM-H, and MobileSAM (MSAM). The x-axis is the average time of 100 runs of 384 × 1024 inputs, and the y-axis is the F1 score on KITTI. (The GPU is not named; a runtime-measurement sketch following this protocol appears after the table.) |
| Software Dependencies | No | No specific software dependencies with version numbers were mentioned in the paper. |
| Experiment Setup | No | Training Settings: We follow the setup of previous work (Huang et al. 2022a) and divide the training into two stages: C+T-Stage and C+T+S+K+H-Stage. To speed up training, we skip the stage of training on the Chairs dataset by using the FlowFormer-things checkpoint as initialization, and the SAM encoder is kept frozen during training. Test Settings: For testing, we adopt the tiling strategy (Jaegle et al. 2021) to bridge the resolution gap between training and testing data. (Training-freeze and tiled-inference sketches follow the table.) |
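
The EPE and F1-all numbers quoted in the table are standard optical-flow metrics rather than anything specific to SAMFlow. Below is a minimal sketch of how they are conventionally computed; the 3 px / 5% outlier rule follows the usual KITTI definition, and the function names are illustrative, not taken from the paper (which releases no code).

```python
import numpy as np

def end_point_error(flow_pred, flow_gt):
    """Average end-point error (EPE): mean Euclidean distance between
    predicted and ground-truth flow vectors, in pixels.
    Both arrays are assumed to have shape (H, W, 2)."""
    return np.linalg.norm(flow_pred - flow_gt, axis=-1).mean()

def f1_all(flow_pred, flow_gt, abs_thresh=3.0, rel_thresh=0.05):
    """KITTI F1-all: percentage of pixels whose flow error exceeds both
    3 px and 5% of the ground-truth flow magnitude."""
    err = np.linalg.norm(flow_pred - flow_gt, axis=-1)
    mag = np.linalg.norm(flow_gt, axis=-1)
    outliers = (err > abs_thresh) & (err > rel_thresh * mag)
    return 100.0 * outliers.mean()
```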
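The Training Settings passage quoted under Dataset Splits and Experiment Setup states that training starts from a FlowFormer-things checkpoint and keeps the SAM encoder frozen. Since no code is released, the sketch below only illustrates that pattern in PyTorch with placeholder modules; the class, layer shapes, and checkpoint path are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Placeholder stand-ins: the real SAMFlow modules are not public, so a
# trivial encoder/flow-branch pair is used to show the freezing pattern.
class SAMFlowSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.sam_encoder = nn.Conv2d(3, 256, kernel_size=16, stride=16)  # stands in for the SAM ViT encoder
        self.flow_branch = nn.Conv2d(256, 2, kernel_size=3, padding=1)   # stands in for the FlowFormer branch

model = SAMFlowSketch()

# Initialize the flow branch from a FlowFormer checkpoint trained on
# FlyingThings (path is illustrative), skipping the FlyingChairs stage.
# state = torch.load("flowformer_things.pth", map_location="cpu")
# model.flow_branch.load_state_dict(state, strict=False)

# Keep the SAM image encoder frozen during both training stages.
for p in model.sam_encoder.parameters():
    p.requires_grad = False

# Only the remaining trainable parameters are handed to the optimizer.
optimizer = torch.optim.AdamW(p for p in model.parameters() if p.requires_grad)
```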
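For the test-time tiling strategy (Jaegle et al. 2021) mentioned under Experiment Setup, a generic sketch of overlapping-tile inference with uniform blending is given below; the tile and stride sizes and the `model(img1, img2)` calling convention are assumptions rather than the paper's actual settings.

```python
import torch

def tiled_flow_inference(model, img1, img2, tile=(432, 960), stride=(216, 480)):
    """Run a two-frame flow model on overlapping tiles of a larger image
    and average the overlapping predictions. Tile/stride values are illustrative."""
    _, _, H, W = img1.shape
    th, tw = min(tile[0], H), min(tile[1], W)
    flow_sum = torch.zeros(1, 2, H, W)
    count = torch.zeros(1, 1, H, W)
    ys = list(range(0, H - th + 1, stride[0]))
    xs = list(range(0, W - tw + 1, stride[1]))
    if ys[-1] != H - th:
        ys.append(H - th)  # ensure the last tile reaches the bottom edge
    if xs[-1] != W - tw:
        xs.append(W - tw)  # ...and the right edge
    with torch.no_grad():
        for y in ys:
            for x in xs:
                flow = model(img1[:, :, y:y + th, x:x + tw],
                             img2[:, :, y:y + th, x:x + tw])  # (1, 2, th, tw)
                flow_sum[:, :, y:y + th, x:x + tw] += flow
                count[:, :, y:y + th, x:x + tw] += 1.0
    return flow_sum / count

# Example with a dummy model that returns zero flow:
dummy = lambda a, b: torch.zeros(a.shape[0], 2, a.shape[2], a.shape[3])
flow = tiled_flow_inference(dummy, torch.randn(1, 3, 436, 1024), torch.randn(1, 3, 436, 1024))
```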
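Finally, the figure caption quoted under Hardware Specification describes the runtime protocol (average over 100 runs at 384 × 1024) but not the GPU. A common way to reproduce that style of measurement in PyTorch is sketched below; the warm-up count and device handling are assumptions, not details from the paper.

```python
import time
import torch

def average_runtime(model, runs=100, size=(384, 1024), device="cuda"):
    """Average forward-pass time over `runs` inferences at the given
    resolution, mirroring the 100-run, 384x1024 protocol in the caption."""
    img1 = torch.randn(1, 3, *size, device=device)
    img2 = torch.randn(1, 3, *size, device=device)
    model = model.to(device).eval()
    with torch.no_grad():
        for _ in range(10):  # warm-up to exclude one-off setup costs
            model(img1, img2)
        if device == "cuda":
            torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(runs):
            model(img1, img2)
        if device == "cuda":
            torch.cuda.synchronize()
    return (time.perf_counter() - start) / runs
```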