VPDETR: End-to-End Vanishing Point DEtection TRansformers
Authors: Taiyan Chen, Xianghua Ying, Jinfa Yang, Ruibin Wang, Ruohao Guo, Bowei Xing, Ji Shi
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that VPDETR achieves competitive performance compared to state-of-the-art methods without requiring post-processing, with a good balance of accuracy and inference speed; ablation experiments showcase the effectiveness of the proposed components. |
| Researcher Affiliation | Academia | National Key Laboratory of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University chenty@stu.pku.edu.cn, xhying@pku.edu.cn |
| Pseudocode | No | The paper describes the model and its components using textual descriptions and mathematical equations, but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | Since our method is based on set predictions, it can be applied both to datasets that follow the Manhattan world assumption, such as SU3 (Zhou et al. 2019b), ScanNet (Dai et al. 2017), YUD (Denis, Elder, and Estrada 2008), and SVVP (Tong et al. 2022), and to non-Manhattan world datasets such as NYU Depth (Silberman et al. 2012). |
| Dataset Splits | Yes | The ScanNet dataset comprises 189,916 RGB-D images of real-world indoor scenes for training and 53,193 for validation. |
| Hardware Specification | Yes | We implement our model on an Nvidia RTX 2080 Ti for a fair comparison of inference speed. |
| Software Dependencies | No | The paper mentions using 'ResNet-50' (He et al. 2016) and 'AdamW' (Loshchilov and Hutter 2017) but does not specify general software dependencies like frameworks (e.g., PyTorch, TensorFlow) or their version numbers. |
| Experiment Setup | Yes | The number of queries is set as 20. By default, models are trained for 220 epochs and the learning rate is decayed at the 200-th epoch by a factor of 0.1. We trained our model using AdamW (Loshchilov and Hutter 2017) with a base learning rate of 5×10⁻⁵ and weight decay of 10⁻⁴. |
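The training recipe reported in the Experiment Setup row can be sketched as a small, framework-agnostic configuration. The paper names no framework, so the `learning_rate` helper and `config` dictionary below are illustrative, not the authors' code; only the hyperparameter values come from the paper.

```python
def learning_rate(epoch, base_lr=5e-5, decay_epoch=200, gamma=0.1):
    """Step schedule described in the paper: the base learning rate
    is used until the 200-th epoch, then decayed once by a factor of 0.1."""
    return base_lr * gamma if epoch >= decay_epoch else base_lr

# Hyperparameters reported in the paper (optimizer: AdamW,
# Loshchilov and Hutter 2017).
config = {
    "num_queries": 20,       # object queries in the decoder
    "epochs": 220,           # total training epochs
    "base_lr": 5e-5,         # base learning rate for AdamW
    "weight_decay": 1e-4,    # AdamW weight decay
}

# Per-epoch learning-rate schedule for the full training run.
schedule = [learning_rate(e, config["base_lr"]) for e in range(config["epochs"])]
```

Any optimizer implementation exposing per-epoch learning-rate updates could consume `schedule` directly; the single step at epoch 200 matches the "decayed at the 200-th epoch by a factor of 0.1" description.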