OPUS: Occupancy Prediction Using a Sparse Set
Authors: Jiabao Wang, Zhaojiang Liu, Qiang Meng, Liujiang Yan, Ke Wang, Jie Yang, Wei Liu, Qibin Hou, Ming-Ming Cheng
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, compared with current state-of-the-art methods, our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at near 2× FPS, while our heaviest model surpasses previous best results by 6.1 RayIoU. |
| Researcher Affiliation | Collaboration | Jiabao Wang1, Zhaojiang Liu2, Qiang Meng3, Liujiang Yan3, Ke Wang3, Jie Yang2, Wei Liu2, Qibin Hou1,4, Ming-Ming Cheng1,4 — 1VCIP, College of Computer Science, Nankai University; 2Shanghai Jiao Tong University; 3KargoBot Inc.; 4NKIARI, Shenzhen Futian |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | https://github.com/jbwang1997/OPUS |
| Open Datasets | Yes | All models are evaluated on the Occ3D-nuScenes [38] dataset, which provides occupancy labels for 18 classes (1 free class and 17 semantic classes) on the large-scale nuScenes [2] benchmark. |
| Dataset Splits | Yes | Out of the 1,000 labeled driving scenes, 750/150/150 are used for training/validation/testing, respectively. |
| Hardware Specification | Yes | All models are trained on 8 NVIDIA 4090 GPUs with a batch size of 8 using the AdamW [26] optimizer. The learning rate warms up to 2e-4 in the first 500 iterations and then decays with a Cosine Annealing [25] scheme. Unless otherwise stated, models in main results are trained for 100 epochs and those in the ablation study are trained for 12 epochs. |
| Software Dependencies | No | The paper mentions using a 'ResNet50 [7] backbone', 'AdamW [26] optimizer', and 'Cosine Annealing [25] scheme', as well as 'focal loss [17]'. However, it does not provide specific version numbers for these software components or the underlying deep learning framework. |
| Experiment Setup | Yes | Following previous works [21, 16, 8], we resize images to 704×256 and extract features using a ResNet50 [7] backbone. We denote a series of models as OPUS-T, OPUS-S, OPUS-M and OPUS-L, with 0.6K, 1.2K, 2.4K and 4.8K queries, respectively. In each model, all queries predict an equal number of points, totalling 76.8K points in the final stage. The sampling number in our CPS is 4 for OPUS-T and 2 for other models. All models are trained on 8 NVIDIA 4090 GPUs with a batch size of 8 using the AdamW [26] optimizer. The learning rate warms up to 2e-4 in the first 500 iterations and then decays with a Cosine Annealing [25] scheme. Unless otherwise stated, models in main results are trained for 100 epochs and those in the ablation study are trained for 12 epochs. |
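The reported schedule (linear warmup to a peak learning rate of 2e-4 over the first 500 iterations, then cosine-annealed decay) can be sketched in pure Python. This is a minimal sketch, not the authors' code: the total iteration count and the zero-LR floor are assumptions the excerpt does not state.

```python
import math

def learning_rate(step: int, total_steps: int,
                  peak_lr: float = 2e-4, warmup_steps: int = 500) -> float:
    """Warmup-then-cosine schedule matching the reported hyperparameters.

    Assumptions (not stated in the excerpt): warmup is linear from 0,
    and the cosine anneals down to a floor of 0.
    """
    if step < warmup_steps:
        # Linear warmup: reaches peak_lr at the end of the warmup phase.
        return peak_lr * (step + 1) / warmup_steps
    # Cosine annealing from peak_lr to 0 over the remaining iterations.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))
```

For example, with a hypothetical run of 10,000 iterations, the rate peaks at 2e-4 at step 499, halves to 1e-4 at the schedule's midpoint, and decays to 0 at the final step.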
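The query budget in the setup row implies a fixed per-query point count: 76.8K final-stage points divided evenly across 0.6K–4.8K queries. The helper below is hypothetical and only reproduces that arithmetic; the names are not from the authors' code.

```python
# Total point budget at the final stage, shared by all OPUS variants.
POINT_BUDGET = 76_800  # 76.8K points

# Query counts per model variant, as listed in the setup description.
QUERY_COUNTS = {"OPUS-T": 600, "OPUS-S": 1_200, "OPUS-M": 2_400, "OPUS-L": 4_800}

def points_per_query(model: str) -> int:
    """Points each query predicts so the variant hits the shared budget."""
    queries = QUERY_COUNTS[model]
    assert POINT_BUDGET % queries == 0, "budget must divide evenly across queries"
    return POINT_BUDGET // queries
```

So OPUS-T predicts 128 points per query while OPUS-L predicts only 16, trading per-query resolution for query count under the same total budget.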