Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Point Cluster: A Compact Message Unit for Communication-Efficient Collaborative Perception
Authors: Zihan Ding, Jiahui Fu, Si Liu, Hongyu Li, Siheng Chen, Hongsheng Li, Shifeng Zhang, Xu Zhou
ICLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on several widely recognized collaborative perception benchmarks showcase the superior performance of our method compared to the previous state-of-the-art approaches. 4 EXPERIMENTS; 4.1 DATASETS AND EVALUATION METRICS; 4.2 IMPLEMENTATION DETAILS; 4.3 COMPARISON WITH STATE-OF-THE-ART METHODS; 4.4 ABLATION STUDIES |
| Researcher Affiliation | Collaboration | Zihan Ding1, Jiahui Fu1, Si Liu1, Hongyu Li1, Siheng Chen2, Hongsheng Li3,4, Shifeng Zhang5, Xu Zhou5; 1 Institute of Artificial Intelligence, Beihang University; 2 School of Artificial Intelligence, Shanghai Jiao Tong University; 3 MMLab, CUHK; 4 Centre for Perceptual and Interactive Intelligence; 5 Sangfor Technologies |
| Pseudocode | Yes | Algorithm 1: Semantic and Distribution guided Farthest Point Sampling Algorithm. $N_{\text{point}}$ is the number of input points and $N_{\text{sample}} = N_{\text{point}} \times \zeta$ is the number of sampled points controlled by a predefined sampling rate $\zeta$. Input: coordinates $P = \{p_1, \dots, p_{N_{\text{fg}}}\} \in \mathbb{R}^{N_{\text{point}} \times 3}$; semantic scores $S_f = \{s_f^1, \dots, s_f^{N_{\text{point}}}\} \in \mathbb{R}^{N_{\text{point}}}$; distribution scores $S_d = \{s_d^1, \dots, s_d^{N_{\text{point}}}\} \in \mathbb{R}^{N_{\text{point}}}$. Output: sampled key point set $\tilde{P} = \{\tilde{p}_1, \dots, \tilde{p}_{N_{\text{sample}}}\}$ |
| Open Source Code | No | The paper does not contain an explicit statement about releasing their code or a link to a code repository. |
| Open Datasets | Yes | We conducted experiments on three widely used benchmarks for collaborative perception, i.e., V2XSet Xu et al. (2022a), OPV2V Xu et al. (2022b), and DAIR-V2X-C Yu et al. (2022). DAIR-V2X-C Yu et al. (2022) is the first to provide a large-scale collection of real-world scenarios for vehicle-infrastructure collaborative autonomous driving. V2XSet Xu et al. (2022a) is a large-scale V2X perception dataset founded on CARLA Dosovitskiy et al. (2017) and OpenCDA Xu et al. (2021). OPV2V Xu et al. (2022b) is a vehicle-to-vehicle collaborative perception dataset, co-simulated by OpenCDA Xu et al. (2021) and CARLA Dosovitskiy et al. (2017). |
| Dataset Splits | Yes | V2XSet has 11,447 frames (6,694/ 1,920/2,833 for train/validation/test respectively) captured in 55 representative simulation scenes that cover the most common driving scenarios in real life. |
| Hardware Specification | Yes | Adam (Kingma & Ba, 2014) is employed as the optimizer for training our model end-to-end on NVIDIA Tesla V100 GPUs, with a total of 35 epochs. |
| Software Dependencies | No | Our method is implemented with PyTorch. (Only PyTorch is mentioned, with no version number.) |
| Experiment Setup | Yes | We set the perception range along the x, y, and z-axis to $[-140.8\,\text{m}, 140.8\,\text{m}] \times [-40\,\text{m}, 40\,\text{m}] \times [-3\,\text{m}, 1\,\text{m}]$ for V2XSet and $[-100.8\,\text{m}, 100.8\,\text{m}] \times [-40\,\text{m}, 40\,\text{m}] \times [-3\,\text{m}, 1\,\text{m}]$ for DAIR-V2X-C, respectively. The thresholds $\epsilon_{\text{agg}}$, $\epsilon_{\text{pose}}$, $\epsilon_{\text{latency}}$, and $\epsilon_{\text{latency}}$ for cluster matching are set as 0.6, 1.5, 0.5 and 2.0, respectively. The number of SIR layers is $L_1 = 6$ in PCE and $L_2 = 3$ during message decoding. The channel number of cluster features is $D = 128$. Adam (Kingma & Ba, 2014) is employed as the optimizer for training our model end-to-end on NVIDIA Tesla V100 GPUs, with a total of 35 epochs. The initial learning rate is set as 0.001 and we reduce it by a factor of 10 after 20 and 30 epochs, respectively. |
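
The learning-rate schedule quoted in the Experiment Setup row can be made concrete with a small sketch. The paper's code is not released, so this is not the authors' implementation. It assumes 0-indexed epochs and reads "reduce it by 10" as multiplying the rate by 0.1 once 20 epochs, and again once 30 epochs, have completed. The function and constant names are hypothetical. Under those assumptions the behaviour matches what PyTorch's `MultiStepLR` scheduler would produce with `milestones=[20, 30]` and `gamma=0.1`.

```python
import math

# Values taken from the quoted setup; the names are illustrative only.
INITIAL_LR = 1e-3       # initial learning rate
DECAY_FACTOR = 0.1      # assumption: "reduce by 10" means multiply by 0.1
MILESTONES = (20, 30)   # number of completed epochs that triggers each decay
TOTAL_EPOCHS = 35       # total training epochs


def learning_rate(epoch: int) -> float:
    """Return the learning rate used during the given 0-indexed epoch."""
    if not 0 <= epoch < TOTAL_EPOCHS:
        raise ValueError(f"epoch must be in [0, {TOTAL_EPOCHS}), got {epoch}")
    # Count how many decay milestones have already been reached.
    num_decays = sum(1 for milestone in MILESTONES if epoch >= milestone)
    return INITIAL_LR * DECAY_FACTOR ** num_decays


if __name__ == "__main__":
    # Print the first epoch of each phase of the schedule.
    for epoch in (0, 20, 30):
        print(f"epoch {epoch:2d}: lr = {learning_rate(epoch):.0e}")
    assert math.isclose(learning_rate(TOTAL_EPOCHS - 1), 1e-5)
```

Under this reading, epochs 0 to 19 train at 0.001, epochs 20 to 29 at 0.0001, and epochs 30 to 34 at 0.00001.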