A Global Depth-Range-Free Multi-View Stereo Transformer Network with Pose Embedding
Authors: Yitong Dong, Yijin Li, Zhaoyang Huang, Weikang Bian, Jingbo Liu, Hujun Bao, Zhaopeng Cui, Hongsheng Li, Guofeng Zhang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive results on the DTU dataset and Tanks&Temple benchmark demonstrate the effectiveness of our method. |
| Researcher Affiliation | Collaboration | 1State Key Lab of CAD&CG, Zhejiang University 2CUHK MMLab |
| Pseudocode | No | The paper describes the method using text and mathematical equations, but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | We plan to release the code and detailed results later. |
| Open Datasets | Yes | DTU dataset [23] is an indoor multi-view stereo dataset... Blended MVS dataset [66] is a large-scale outdoor multi-view stereo dataset... Tanks and Temples [24] is a public multi-view stereo benchmark |
| Dataset Splits | Yes | Following MVSNet [8], we partitioned the DTU dataset into 79 training sets, 18 validation sets, and 22 evaluation sets. |
| Hardware Specification | Yes | The training procedure is finished on two A100 |
| Software Dependencies | No | The paper mentions 'Implemented by PyTorch [67]' but does not provide specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | On the DTU dataset, we set the image resolution as 640×512 and the number of input images as 5 for the training phase... For all models, we use the AdamW optimizer with an initial learning rate of 0.0002 that halves every four epochs for 16 epochs. |
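The learning-rate schedule quoted in the Experiment Setup row (initial rate 0.0002, halved every four epochs over 16 epochs) can be sketched in plain Python. This is a minimal illustration of step decay, assuming the halving applies at each four-epoch boundary; the function name `lr_at_epoch` is hypothetical and not from the paper.

```python
def lr_at_epoch(epoch: int, base_lr: float = 2e-4, step: int = 4, gamma: float = 0.5) -> float:
    """Step-decay schedule: the rate is multiplied by `gamma` every `step` epochs.

    Matches the paper's quoted setup under the assumption that the decay
    is applied at epochs 4, 8, and 12 of a 16-epoch run.
    """
    return base_lr * gamma ** (epoch // step)

# Schedule over the 16 training epochs:
schedule = [lr_at_epoch(e) for e in range(16)]
```

In a PyTorch training loop, the same behavior would typically be obtained with `torch.optim.lr_scheduler.StepLR(optimizer, step_size=4, gamma=0.5)` wrapped around an `AdamW` optimizer.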