Asynchrony-Robust Collaborative Perception via Bird's Eye View Flow
Authors: Sizhe Wei, Yuxi Wei, Yue Hu, Yifan Lu, Yiqi Zhong, Siheng Chen, Ya Zhang
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments conducted on both IRV2V and the real-world dataset DAIR-V2X show that CoBEVFlow consistently outperforms other baselines and is robust in extremely asynchronous settings. The code is available at https://github.com/MediaBrain-SJTU/CoBEVFlow. |
| Researcher Affiliation | Collaboration | 1 Cooperative Medianet Innovation Center, Shanghai Jiao Tong University; 2 University of Southern California; 3 Shanghai AI Laboratory. 1 {sizhewei, wyx3590236732, 18671129361, yifan_lu}@sjtu.edu.cn, {sihengc, ya_zhang}@sjtu.edu.cn; 2 yiqizhon@usc.edu |
| Pseudocode | No | The paper describes the system architecture and processes in text and with equations, but does not include a formal pseudocode block or algorithm listing. |
| Open Source Code | Yes | The code is available at https://github.com/MediaBrain-SJTU/CoBEVFlow. |
| Open Datasets | Yes | To facilitate research on asynchrony for collaborative perception, we simulate the first collaborative perception dataset with different temporal asynchronies based on CARLA [39], named IRregular V2V (IRV2V). ... DAIR-V2X. DAIR-V2X [14] is a real-world collaborative perception dataset. |
| Dataset Splits | Yes | We have split the dataset into training, validation, and testing sets, which contain 5,445, 994, and 2,010 samples, respectively. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using PointPillars [40] and adopting settings from CoAlign [17] for the backbone, but does not provide specific version numbers for software dependencies like Python, PyTorch, CUDA, etc. |
| Experiment Setup | Yes | We conduct training for a total of 60 epochs, starting with an initial learning rate of 2e-3. Subsequently, at the 10th and 20th epochs, the learning rate decreases to 10% of its previous value. For the IRV2V dataset, we set the lidar range as x ∈ [-140.8, +140.8] m, y ∈ [-40, +40] m. The voxel size is h = w = 0.4 m. The feature map's size is H = 200, W = 704. For the DAIR-V2X dataset, we set the lidar range as x ∈ [-100.8, +100.8] m, y ∈ [-40, +40] m. The voxel size is h = w = 0.4 m. The feature map's size is H = 200, W = 504. |
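
The training schedule and perception-range settings reported in the Experiment Setup row can be summarized as a short configuration sketch. The snippet below is a minimal illustration, not the authors' released code: the optimizer type (Adam), the `MultiStepLR` schedule object, and the `RANGE_CFG` / `check_feature_map` names are assumptions for illustration, while the numeric values (60 epochs, initial learning rate 2e-3, decay to 10% at epochs 10 and 20, lidar ranges, voxel size, feature-map sizes) come from the paper.

```python
# Minimal sketch of the reported training and range configuration.
# Optimizer/scheduler choice and all names are assumptions; numbers are from the paper.
import torch

# Hypothetical per-dataset perception-range settings reproducing the reported values.
RANGE_CFG = {
    "IRV2V": {
        "lidar_range_x": (-140.8, 140.8),  # meters
        "lidar_range_y": (-40.0, 40.0),    # meters
        "voxel_size": 0.4,                 # h = w = 0.4 m
        "feature_map": (200, 704),         # (H, W)
    },
    "DAIR-V2X": {
        "lidar_range_x": (-100.8, 100.8),
        "lidar_range_y": (-40.0, 40.0),
        "voxel_size": 0.4,
        "feature_map": (200, 504),
    },
}

def check_feature_map(cfg):
    """Sanity check: feature-map size equals lidar range divided by voxel size."""
    x_lo, x_hi = cfg["lidar_range_x"]
    y_lo, y_hi = cfg["lidar_range_y"]
    w = round((x_hi - x_lo) / cfg["voxel_size"])
    h = round((y_hi - y_lo) / cfg["voxel_size"])
    assert (h, w) == cfg["feature_map"], (h, w)

for cfg in RANGE_CFG.values():
    check_feature_map(cfg)

model = torch.nn.Linear(8, 8)  # placeholder model for the sketch
optimizer = torch.optim.Adam(model.parameters(), lr=2e-3)  # optimizer type is assumed
# Learning rate drops to 10% of its previous value at the 10th and 20th epochs.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[10, 20], gamma=0.1)

for epoch in range(60):  # 60 training epochs in total
    # ... one training epoch over the collaborative-perception dataset ...
    scheduler.step()
```

The `check_feature_map` helper only verifies internal consistency of the reported numbers, e.g. for IRV2V the x-range of 281.6 m divided by the 0.4 m voxel size gives W = 704, and the 80 m y-range gives H = 200.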