How2comm: Communication-Efficient and Collaboration-Pragmatic Multi-Agent Perception
Authors: Dingkang Yang, Kun Yang, Yuzheng Wang, Jing Liu, Zhi Xu, Rongbin Yin, Peng Zhai, Lihua Zhang
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our framework is thoroughly evaluated on several LiDAR-based collaborative detection datasets in real-world and simulated scenarios. Comprehensive experiments demonstrate the superiority of How2comm and the effectiveness of all its vital components. |
| Researcher Affiliation | Collaboration | ¹Academy for Engineering and Technology, Fudan University; ²Cognition and Intelligent Technology Laboratory (CIT Lab); ³Engineering Research Center of AI and Robotics, Ministry of Education; ⁴AI and Unmanned Systems Engineering Research Center of Jilin Province; ⁵FAW (Nanjing) Technology Development Company Ltd |
| Pseudocode | No | The paper describes its architecture and components through textual descriptions and figures, but it does not include any formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code will be released at https://github.com/ydk122024/How2comm. |
| Open Datasets | Yes | Multi-Agent 3D Detection Datasets. To evaluate the performance of How2comm on the collaborative perception task, we conduct extensive experiments on three multi-agent datasets, including DAIR-V2X [52], V2XSet [40], and OPV2V [41]. |
| Dataset Splits | Yes | DAIR-V2X [52] is a real-world vehicle-to-infrastructure perception dataset containing 100 realistic scenarios and 18,000 data samples. Each sample collects the labeled LiDAR point clouds of a vehicle and an infrastructure. The training/validation/testing sets are split in a ratio of 5:2:3. V2XSet [40] is a simulated dataset supporting V2X perception, co-simulated by Carla [3] and OpenCDA [36]. It includes 73 representative scenes with 2 to 5 connected agents and 11,447 3D annotated LiDAR point cloud frames. The training/validation/testing sets are 6,694, 1,920, and 2,833 frames, respectively. OPV2V [41] is a large-scale simulated dataset for multi-agent V2V perception, comprising 10,914 LiDAR point cloud frames with 3D annotation. The training/validation/testing splits include 6,764, 1,981, and 2,169 frames, respectively. |
| Hardware Specification | Yes | We build all the models using the Pytorch toolbox [18] and train them on Tesla V100 GPUs with the Adam optimizer [10]. |
| Software Dependencies | No | The paper mentions using the 'Pytorch toolbox [18]' but does not provide a specific version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | The learning rate is set to 2e-3 and decays exponentially by 0.1 every 15 epochs. The training settings on the DAIR-V2X [52], V2XSet [40], and OPV2V [41] datasets include: the training epochs are {30, 40, 40}, and batch sizes are {2, 1, 1}. |
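
Based on the hardware and experiment-setup rows above, the following is a minimal PyTorch sketch of the reported training configuration (Adam, learning rate 2e-3, decayed by a factor of 0.1 every 15 epochs, with per-dataset epochs and batch sizes). The dictionary layout, function name, and use of `StepLR` are illustrative assumptions, not taken from the authors' released code.

```python
import torch

# Per-dataset training settings quoted in the "Experiment Setup" row
# (epochs and batch sizes for DAIR-V2X, V2XSet, and OPV2V).
TRAIN_CONFIG = {
    "DAIR-V2X": {"epochs": 30, "batch_size": 2},
    "V2XSet":   {"epochs": 40, "batch_size": 1},
    "OPV2V":    {"epochs": 40, "batch_size": 1},
}

def build_optimizer(model: torch.nn.Module):
    """Adam optimizer with lr 2e-3, decayed by 0.1 every 15 epochs.

    StepLR(step_size=15, gamma=0.1) is one plausible reading of the paper's
    "decays exponentially by 0.1 every 15 epochs"; the released code may differ.
    """
    optimizer = torch.optim.Adam(model.parameters(), lr=2e-3)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=15, gamma=0.1)
    return optimizer, scheduler
```

In a training loop, `scheduler.step()` would be called once per epoch so that the learning rate drops at epochs 15 and 30, which is consistent with the quoted 30- and 40-epoch schedules.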