Rethinking Disparity: A Depth Range Free Multi-View Stereo Based on Disparity

Authors: Qingsong Yan, Qiang Wang, Kaiyong Zhao, Bo Li, Xiaowen Chu, Fei Deng

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on the DTUMVS and Tanks&Temples datasets show that DispMVS is not sensitive to the depth range and achieves state-of-the-art results with lower GPU memory. In this section, we benchmark our DispMVS on two public datasets and compare it with a set of existing methods. We also conduct ablation experiments to explore the effects of different settings of DispMVS.
Researcher Affiliation | Collaboration | 1. Wuhan University, Wuhan, China; 2. The Hong Kong University of Science and Technology, Hong Kong SAR, China; 3. Harbin Institute of Technology (Shenzhen), Shenzhen, China; 4. XGRIDS, Shenzhen, China; 5. The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China
Pseudocode | No | The paper describes the method using text and mathematical equations but does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described, such as a specific repository link or an explicit code release statement.
Open Datasets | Yes | The DTUMVS (Aanæs et al. 2016) is an indoor dataset captured in a controlled environment, containing 79 scenes for training, 22 for testing, and 18 for validation. The BlendedMVS (Yao et al. 2020) is a large dataset captured from various outdoor scenes, with 106 scenes for training and the remaining 7 scenes for testing. The Tanks&Temples (Knapitsch et al. 2017) is an outdoor multi-view stereo benchmark that contains 14 real-world scenes under complex conditions.
Dataset Splits | Yes | The DTUMVS (Aanæs et al. 2016) is an indoor dataset captured in a controlled environment, containing 79 scenes for training, 22 for testing, and 18 for validation.
Hardware Specification | Yes | The training procedure is finished on two V100 GPUs with tc = 8, tf = 2 considering the GPU memory limitation.
Software Dependencies | No | The paper mentions "PyTorch (Paszke et al. 2019)" and "Adam (Kingma and Ba 2015)" but does not specify version numbers for these software components or any other libraries.
Experiment Setup | Yes | On the DTUMVS, we set the image resolution to 640×512 and N = 5. On the BlendedMVS, we set the image resolution to 768×576 and N = 5. For all models, we apply the training strategy in PatchmatchNet (Wang et al. 2021) for better learning of the weights and use the Adam (Kingma and Ba 2015) optimizer (β1 = 0.9, β2 = 0.999) with an initial learning rate of 0.0002 that halves every four epochs for 16 epochs. In DispMVS, we set ms = 4, mp = 9 at the coarse stage and ms = 2, mp = 5 at the fine stage. DispMVS uses cr, csi with tc iterations at the coarse stage and fr, fsi with tf iterations at the fine stage. ... with tc = 8, tf = 2 considering the GPU memory limitation.
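The learning-rate schedule quoted above (initial rate 0.0002, halved every four epochs, 16 epochs total) is a simple step decay. A minimal sketch in plain Python, assuming the halving is applied at epoch boundaries; the function name is illustrative, not from the paper:

```python
# Step-decay schedule as described: lr starts at 2e-4 and is halved
# once every four epochs over a 16-epoch run.
# `lr_at_epoch` is a hypothetical helper, not code from the paper.

def lr_at_epoch(epoch: int, base_lr: float = 2e-4, halve_every: int = 4) -> float:
    """Return the learning rate in effect at a given (0-indexed) epoch."""
    return base_lr * 0.5 ** (epoch // halve_every)

schedule = [lr_at_epoch(e) for e in range(16)]
print(schedule[0], schedule[4], schedule[15])  # 2e-4, 1e-4, 2.5e-05
```

In PyTorch this corresponds to `torch.optim.Adam(params, lr=2e-4, betas=(0.9, 0.999))` combined with `torch.optim.lr_scheduler.StepLR(optimizer, step_size=4, gamma=0.5)`.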