RETR: Multi-View Radar Detection Transformer for Indoor Perception

Authors: Ryoma Yataka, Adriano Cardace, Perry Wang, Petros Boufounos, Ryuhei Takahashi

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Evaluated on two indoor radar perception datasets, our approach outperforms existing state-of-the-art methods by a margin of 15.38+ AP for object detection and 11.91+ IoU for instance segmentation, respectively." (A box-IoU sketch follows the table.)
Researcher Affiliation | Collaboration | Mitsubishi Electric Research Laboratories (MERL), USA; Department of Computer Science and Engineering, University of Bologna, Italy; Information Technology R&D Center (ITC), Mitsubishi Electric Corporation, Japan
Pseudocode | No | The paper describes the architecture and processes in text and diagrams (Figures 3, 7, and 8) but does not present them in formal pseudocode or algorithm blocks.
Open Source Code | Yes | "Our implementation is available at https://github.com/merlresearch/radar-detection-transformer."
Open Datasets | Yes | "We evaluate performance over two open indoor radar perception datasets: MMVR [26] and HIBER [38]." https://zenodo.org/records/12611978 https://github.com/Intelligent-Perception-Lab/HIBER
Dataset Splits | Yes | "For the training-validation-test split, we follow the data split S1 as defined in MMVR." Table 5 (Details of hyper-parameters): # of training 190441 / 118280; # of validation 23899 / 33841; # of test 23458 / 85677. (Split proportions are sanity-checked below the table.)
Hardware Specification | Yes | Table 5 (Details of hyper-parameters): GPU (NVIDIA) A40.
Software Dependencies | No | The paper mentions using 'ResNet' as a backbone, but does not specify software dependencies such as programming language versions (e.g., Python 3.x) or library versions (e.g., PyTorch 1.x, TensorFlow 2.x).
Experiment Setup | Yes | Table 5 (Details of hyper-parameters) lists specific values for parameters such as 'Total dimension of positional embedding', 'Ratio of depth dimension for TPE', '# of input frames', 'Top-K selection magnitude', '# of encoder blocks', '# of decoder blocks', '# of heads of multi-head attention', '# of queries', 'Threshold for detection and segmentation', 'Loss weight for GIoU on horizontal plane', 'Loss weight for L1 on horizontal plane', 'Batch size', 'Epoch for detection', 'Epoch for segmentation', 'Patience for early stopping', 'Learning rate', 'Scheduler', and 'Weight decay'. (A hypothetical config sketch follows the table.)
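
The 15.38+ AP and 11.91+ IoU margins quoted under Research Type both rest on intersection-over-union. As a reference point only, here is a minimal sketch of axis-aligned box IoU; the (x1, y1, x2, y2) box format is our assumption, and the paper's actual evaluation protocol is defined in its code release.

```python
# Minimal sketch of axis-aligned box IoU. The (x1, y1, x2, y2) corner
# format is an assumption, not the paper's exact evaluation code.

def box_iou(a, b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])  # intersection top-left
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])  # intersection bottom-right
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1 / 7 ≈ 0.143
```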
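The split counts quoted under Dataset Splits can be sanity-checked with a few lines of arithmetic. The assumption that each pair of numbers corresponds to MMVR and HIBER respectively is ours; Table 5 reports both datasets side by side.

```python
# Split proportions implied by the counts quoted from Table 5.
# Mapping the paired counts to MMVR / HIBER is an assumption.

splits = {
    "MMVR":  {"train": 190441, "val": 23899, "test": 23458},
    "HIBER": {"train": 118280, "val": 33841, "test": 85677},
}

for name, s in splits.items():
    total = sum(s.values())
    print(name, total, {k: f"{v / total:.1%}" for k, v in s.items()})
# MMVR 237798 {'train': '80.1%', 'val': '10.1%', 'test': '9.9%'}
# HIBER 237798 {'train': '49.7%', 'val': '14.2%', 'test': '36.0%'}
```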
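Finally, the hyper-parameter names quoted under Experiment Setup map naturally onto a single training config. The sketch below is hypothetical: the field names follow Table 5, but every value is a placeholder, since the quoted row lists the parameter names rather than their values.

```python
# Hypothetical config sketch collecting the Table 5 hyper-parameter names.
# All defaults are None placeholders; the actual values are in the paper.

from dataclasses import dataclass
from typing import Optional

@dataclass
class RETRConfig:
    pos_embed_dim: Optional[int] = None        # total dimension of positional embedding
    tpe_depth_ratio: Optional[float] = None    # ratio of depth dimension for TPE
    num_input_frames: Optional[int] = None     # number of input frames
    topk_magnitude: Optional[int] = None       # Top-K selection magnitude
    num_encoder_blocks: Optional[int] = None
    num_decoder_blocks: Optional[int] = None
    num_heads: Optional[int] = None            # heads of multi-head attention
    num_queries: Optional[int] = None
    det_seg_threshold: Optional[float] = None  # threshold for detection and segmentation
    giou_loss_weight: Optional[float] = None   # GIoU loss weight on horizontal plane
    l1_loss_weight: Optional[float] = None     # L1 loss weight on horizontal plane
    batch_size: Optional[int] = None
    epochs_detection: Optional[int] = None
    epochs_segmentation: Optional[int] = None
    early_stop_patience: Optional[int] = None
    learning_rate: Optional[float] = None
    scheduler: Optional[str] = None
    weight_decay: Optional[float] = None
```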