End-to-End Real-Time Vanishing Point Detection with Transformer
Authors: Xin Tong, Shi Peng, Yufei Guo, Xuhui Huang
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on synthetic and real-world datasets demonstrate that our method can be used in both natural and structural scenes, and is superior to other state-of-the-art methods on the balance of accuracy and efficiency. |
| Researcher Affiliation | Collaboration | Xin Tong*, Shi Peng, Yufei Guo, Xuhui Huang Intelligent Science & Technology Academy of CASIC xin tong@pku.edu.cn, pengshi1828@163.com, yfguo@pku.edu.cn, starhxh@126.com |
| Pseudocode | No | The paper describes the algorithm and network architecture in detail with textual explanations and diagrams (e.g., Figure 3, Figure 4). However, it does not include any formal pseudocode or algorithm blocks labeled as such. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | We conduct our experiments in four publicly available datasets including SU3 dataset (Zhou et al. 2019b), ScanNet dataset (Dai et al. 2017), Natural Scene dataset (Zhou, Farhat, and Wang 2017) and NYU dataset (Silberman et al. 2012). |
| Dataset Splits | Yes | ScanNet dataset is a real-world dataset and captures indoor scenes. It provides 189916 training images, 53193 validation images and 20942 test images. NYU dataset is split into 1000, 224 and 225 images for training, validating and testing the models following (Kluger et al. 2020; Lin et al. 2022). |
| Hardware Specification | Yes | VPTR runs at an inference speed of 140 FPS on one NVIDIA 3090 card. |
| Software Dependencies | No | The paper states: "Our training and evaluation are implemented in PyTorch." While PyTorch is mentioned, no specific version number is provided, nor are any other software dependencies listed with version numbers. |
| Experiment Setup | Yes | In training, we use AdamW as the model optimizer and set weight decay to 10^-4. We train the model for 60 epochs. The initial learning rates are set to 10^-4 for the backbone and 10^-3 for others. Learning rates are reduced by a factor of 10 in epochs 30 and 45. We use a batch size of 16 and the size of the input images is set to 512×512. We divide the hemisphere with N = 256 anchors and use 256 queries in VPTR. λc, λp are set to 1 and 5, and λM is set to 1 in Manhattan scenes. |
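The reported training schedule (step decay at epochs 30 and 45 over 60 epochs, separate initial rates for backbone and other parameters) can be sketched as plain Python; the function name `lr_at_epoch` is an illustrative assumption, not code from the paper:

```python
# Hedged sketch of the paper's learning-rate schedule: initial LRs of
# 1e-4 (backbone) and 1e-3 (other parameters), each reduced by a factor
# of 10 at epochs 30 and 45, over 60 training epochs in total.

def lr_at_epoch(initial_lr, epoch, milestones=(30, 45), gamma=0.1):
    """Step-decay schedule: multiply by `gamma` at each milestone reached."""
    lr = initial_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr

# Per-parameter-group LR curves matching the reported setup
# (batch size 16, 512x512 inputs are orthogonal to the schedule).
backbone_lrs = [lr_at_epoch(1e-4, e) for e in range(60)]
other_lrs = [lr_at_epoch(1e-3, e) for e in range(60)]
```

In PyTorch this would typically be expressed with two parameter groups passed to `torch.optim.AdamW` plus a `MultiStepLR` scheduler with milestones `[30, 45]` and `gamma=0.1`.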