Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026].

PPMStereo: Pick-and-Play Memory Construction for Consistent Dynamic Stereo Matching

Authors: WANG Yun, Junjie Hu, Qiaole Dong, Yongjian Zhang, Yanwei Fu, Tin Lun Lam, Dapeng Wu

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments validate the effectiveness of PPMStereo, demonstrating state-of-the-art performance in both accuracy and temporal consistency. ... Extensive experiments show that our method achieves state-of-the-art temporal consistency and accuracy. Specifically, on both the clean and final pass of the Sintel [6] dataset, our model achieves a temporal end-of-point error (TEPE) of 0.62 and 1.11 pixels, with 3-pixel error rates of 5.19% and 7.64%, respectively. ... 4 Experiments
Researcher Affiliation	Academia	1 City University of Hong Kong, 2 The Chinese University of Hong Kong, Shenzhen, 3 Fudan University, 4 Shenzhen Campus, Sun Yat-sen University
Pseudocode	Yes	Algorithm 1 Pseudo code of Pick-and-Play Memory
Open Source Code	Yes	Codes are available at https://github.com/cocowy1/PPMStereo.
Open Datasets	Yes	For training and evaluation, we employ three synthetic and one real-world stereo video dataset, all featuring dynamic scenes: Scene Flow (SF) [34] comprising Flying Things3D, Driving, and Monkaa, with Flying Things3D featuring moving 3D objects against varied backgrounds. Dynamic Replica (DR) [24], a synthetic indoor dataset with non-rigid objects such as people and animals. Sintel [6], a synthetic movie dataset available in clean and final passes. South Kensington (SV) [23], a real-world stereo dataset without ground truth data, capturing daily scenarios.
Dataset Splits	Yes	Following prior work [24, 22], we train on synthetic datasets (SF and DR + SF) and evaluate the performance on Sintel, DR, and SV. ... The dataset splits were used and shared in previous work.
Hardware Specification	Yes	We implement PPMStereo in PyTorch, training on 8 A100 GPUs (batch size = 2) using 320 × 512 crops from 5-frame sequences, evaluated at full resolution with 20-frame sequences.
Software Dependencies	No	We implement PPMStereo in PyTorch, training on 8 A100 GPUs (batch size = 2) using 320 × 512 crops from 5-frame sequences, evaluated at full resolution with 20-frame sequences. We use AdamW (lr = 0.0003) with one-cycle scheduling, training for 180k iterations (≈ 4.5 days). Data augmentation follows Dynamic Stereo [24], including random crops and saturation shifts. For efficient memory readout, we employ FlashAttention [13]. Following prior works [22, 24], we set the number of evaluation iterations N to 20, while setting N = 10 during training.
Experiment Setup	Yes	We implement PPMStereo in PyTorch, training on 8 A100 GPUs (batch size = 2) using 320 × 512 crops from 5-frame sequences, evaluated at full resolution with 20-frame sequences. We use AdamW (lr = 0.0003) with one-cycle scheduling, training for 180k iterations (≈ 4.5 days). Data augmentation follows Dynamic Stereo [24], including random crops and saturation shifts. ... Following prior works [22, 24], we set the number of evaluation iterations N to 20, while setting N = 10 during training. ... where n denotes the number of iterations and γ is a decay factor set as 0.9. ... where d_t and d̂_t represent the predicted and ground-truth disparities for the t-th frame, respectively, and σ is a hyper-parameter empirically set to 5.
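
The hyperparameters quoted in the Experiment Setup row can be collected in one place, which may help anyone attempting a reproduction. The sketch below is illustrative only and is not taken from the PPMStereo repository: the class and field names are hypothetical, the crop is assumed to be height × width, and the per-iteration weighting γ^(N − i) is an assumption borrowed from the common RAFT-style sequence loss, since the quoted excerpt names γ = 0.9 but does not reproduce the loss formula itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Values quoted in the table above; all field names are hypothetical."""
    num_gpus: int = 8                # 8 A100 GPUs
    batch_size: int = 2              # as quoted; per-GPU vs. total is not stated
    crop_size: tuple = (320, 512)    # assumed to be (height, width)
    train_seq_len: int = 5           # frames per training sequence
    eval_seq_len: int = 20           # frames per evaluation sequence
    lr: float = 3e-4                 # AdamW learning rate
    total_iters: int = 180_000       # training iterations
    train_refine_iters: int = 10     # N during training
    eval_refine_iters: int = 20      # N during evaluation
    gamma: float = 0.9               # decay factor of the iteration-weighted loss


def iteration_weights(n_iters: int, gamma: float) -> list:
    """ASSUMPTION: RAFT-style weights gamma**(n_iters - i) for i = 1..n_iters,
    so later refinement iterations receive larger weights."""
    return [gamma ** (n_iters - i) for i in range(1, n_iters + 1)]


def weighted_l1(preds: list, target: list, gamma: float) -> float:
    """Weighted sum of mean absolute errors over refinement iterations.

    `preds` holds one flat list of disparities per refinement iteration;
    `target` is the flat ground-truth list of the same length.
    """
    weights = iteration_weights(len(preds), gamma)
    total = 0.0
    for w, pred in zip(weights, preds):
        mae = sum(abs(p - t) for p, t in zip(pred, target)) / len(target)
        total += w * mae
    return total
```

Under this assumed weighting, the final refinement iteration always has weight 1 and earlier iterations are discounted geometrically, so with N = 10 and γ = 0.9 the first iteration contributes with weight 0.9^9.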