Rethinking Disparity: A Depth Range Free Multi-View Stereo Based on Disparity
Authors: Qingsong Yan, Qiang Wang, Kaiyong Zhao, Bo Li, Xiaowen Chu, Fei Deng
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on the DTU MVS and Tanks&Temples datasets show that DispMVS is not sensitive to the depth range and achieves state-of-the-art results with lower GPU memory. In this section, we benchmark our DispMVS on two public datasets and compare it with a set of existing methods. We also conduct ablation experiments to explore the effects of different settings of DispMVS. |
| Researcher Affiliation | Collaboration | 1 Wuhan University, Wuhan, China 2 The Hong Kong University of Science and Technology, Hong Kong SAR, China 3 Harbin Institute of Technology (Shenzhen), Shenzhen, China 4 XGRIDS, Shenzhen, China 5 The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China |
| Pseudocode | No | The paper describes the method using text and mathematical equations but does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described, such as a specific repository link or an explicit code release statement. |
| Open Datasets | Yes | The DTU MVS (Aanæs et al. 2016) is an indoor dataset in a controlled environment containing 79 scenes for training, 22 for testing, and 18 for validation. The BlendedMVS (Yao et al. 2020) is a large dataset captured from various outdoor scenes, with 106 scenes for training and the remaining 7 scenes for testing. The Tanks&Temples (Knapitsch et al. 2017) is an outdoor multi-view stereo benchmark that contains 14 real-world scenes under complex conditions. |
| Dataset Splits | Yes | The DTU MVS (Aanæs et al. 2016) is an indoor dataset in a controlled environment containing 79 scenes for training, 22 for testing, and 18 for validation. |
| Hardware Specification | Yes | The training procedure is finished on two V100 GPUs with t_c = 8, t_f = 2 considering the GPU memory limitation. |
| Software Dependencies | No | The paper mentions "PyTorch (Paszke et al. 2019)" and "Adam (Kingma and Ba 2015)" but does not specify version numbers for these software components or any other libraries. |
| Experiment Setup | Yes | On the DTU MVS, we set the image resolution to 640 × 512 and N = 5. On the BlendedMVS, we set the image resolution to 768 × 576 and N = 5. For all models, we apply the training strategy in PatchmatchNet (Wang et al. 2021) for better learning of the weight and use the Adam (Kingma and Ba 2015) (β_1 = 0.9, β_2 = 0.999) optimizer with an initial learning rate of 0.0002 that halves every four epochs for 16 epochs. In DispMVS, we set m_s = 4, m_p = 9 at the coarse stage and m_s = 2, m_p = 5 at the fine stage. DispMVS uses cr, csi with t_c iterations at the coarse stage and fr, fsi with t_f iterations at the fine stage. ...with t_c = 8, t_f = 2 considering the GPU memory limitation. |
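The learning-rate schedule quoted in the experiment setup (initial rate 0.0002, halved every four epochs, for 16 epochs) can be sketched in plain Python. This is an illustrative reconstruction of the stated schedule, not code from the paper; the function name and defaults are ours.

```python
def lr_at_epoch(epoch: int, base_lr: float = 2e-4, halve_every: int = 4) -> float:
    """Step schedule from the paper's setup: halve the learning rate
    every `halve_every` epochs, starting from `base_lr`."""
    return base_lr * 0.5 ** (epoch // halve_every)

# Learning rate over the 16-epoch run described in the paper:
# epochs 0-3 -> 2e-4, 4-7 -> 1e-4, 8-11 -> 5e-5, 12-15 -> 2.5e-5
schedule = [lr_at_epoch(e) for e in range(16)]
```

In a PyTorch training loop, the same behavior corresponds to a step decay with period 4 and factor 0.5 applied to Adam's initial rate of 2e-4.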