Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

SAMFlow: Eliminating Any Fragmentation in Optical Flow with Segment Anything Model

Authors: Shili Zhou, Ruian He, Weimin Tan, Bo Yan

AAAI 2024 | Venue PDF | LLM Run Details

Reproducibility variables, each with the classification result and the LLM's supporting excerpt from the paper:
Research Type: Experimental
Evidence: "Our proposed SAMFlow model reaches 0.86/2.10 clean/final EPE and 3.55/12.32 EPE/F1-all on the Sintel and KITTI-15 training sets, surpassing FlowFormer by 8.5%/9.9% and 13.2%/16.3%. Furthermore, our model achieves state-of-the-art performance on the Sintel and KITTI-15 benchmarks, ranking #1 among all two-frame methods on the Sintel clean pass."
Researcher Affiliation: Academia
Evidence: "School of Computer Science, Shanghai Key Laboratory of Intelligent Information Processing, Fudan University; EMAIL, EMAIL, EMAIL, EMAIL"
Pseudocode: No
Evidence: "Overall, this module can be represented by Formula 4, 5, 6 and 7."
Open Source Code: No
Evidence: No explicit statement or link to open-source code was found in the paper.
Open Datasets: Yes
Evidence: "With the above designs, our proposed SAMFlow achieves remarkable performance, reaching 0.86/2.10 clean/final EPE on the Sintel (Butler et al. 2012) training set and 3.55/12.32 EPE/F1-all on the KITTI-15 (Geiger et al. 2013) training set."
Dataset Splits: No
Evidence: "Training Settings: We follow the setup of previous work (Huang et al. 2022a) and divide the training into two stages: the C+T stage and the C+T+S+K+H stage. To speed up training, we skip the stage of training on the Chairs dataset by using the FlowFormer-things checkpoint as initialization, and the SAM encoder is kept frozen during training."
Hardware Specification: No
Evidence: "Figure 7: Runtime and accuracy comparison between FlowFormer, FlowFormer++, and our models with different SAM encoders, including SAM-B, SAM-H, and MobileSAM (MSAM). The x-axis is the average time of 100 runs on 384 × 1024 inputs, and the y-axis is the F1 score on KITTI."
Software Dependencies: No
Evidence: No specific software dependencies with version numbers were mentioned in the paper.
Experiment Setup: No
Evidence: "Training Settings: We follow the setup of previous work (Huang et al. 2022a) and divide the training into two stages: the C+T stage and the C+T+S+K+H stage. To speed up training, we skip the stage of training on the Chairs dataset by using the FlowFormer-things checkpoint as initialization, and the SAM encoder is kept frozen during training. Test Settings: For testing, we adopt the tiling strategy (Jaegle et al. 2021) to bridge the resolution gap between training and testing data."
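For readers unfamiliar with the metrics quoted above: EPE (end-point error) is the mean per-pixel Euclidean distance between predicted and ground-truth flow vectors, and KITTI's F1-all is the fraction of pixels counted as outliers, i.e. pixels whose error exceeds both 3 px and 5% of the ground-truth flow magnitude. A minimal NumPy sketch of these standard definitions (function names are ours, not from the paper):

```python
import numpy as np

def epe(flow_pred, flow_gt):
    """Average end-point error: mean L2 distance between predicted
    and ground-truth flow vectors, computed per pixel."""
    return np.linalg.norm(flow_pred - flow_gt, axis=-1).mean()

def f1_all(flow_pred, flow_gt):
    """KITTI F1-all outlier rate: a pixel is an outlier when its
    end-point error exceeds both 3 px and 5% of the ground-truth
    flow magnitude."""
    err = np.linalg.norm(flow_pred - flow_gt, axis=-1)
    mag = np.linalg.norm(flow_gt, axis=-1)
    outliers = (err > 3.0) & (err > 0.05 * mag)
    return outliers.mean()
```

For example, a constant prediction of (1, 1) against zero ground truth gives an EPE of √2 ≈ 1.41 and an F1-all of 0, since the error is under the 3 px threshold.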