Semantic-Aware Transformation-Invariant RoI Align

Authors: Guo-Ye Yang, George Kiyohiro Nakayama, Zi-Kai Xiao, Tai-Jiang Mu, Xiaolei Huang, Shi-Min Hu

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that our model significantly outperforms baseline models with slight computational overhead. In addition, it shows excellent generalization ability and can be used to improve performance with various state-of-the-art backbones and detection methods.
Researcher Affiliation | Academia | BNRist, Department of Computer Science and Technology, Tsinghua University; Stanford University; College of Information Sciences and Technology, Pennsylvania State University
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | The code is available at https://github.com/cxjyxxme/SemanticRoIAlign.
Open Datasets | Yes | We conduct our experiments on the MS COCO dataset (Lin et al. 2014), and use the train2017 for training and use the val2017 and the test2017 for testing.
Dataset Splits | Yes | We conduct our experiments on the MS COCO dataset (Lin et al. 2014), and use the train2017 for training and use the val2017 and the test2017 for testing. The ablation experiments are conducted on the object detection task using the MS COCO validation set.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are provided. The paper reports computational cost in FLOPs and parameter counts but does not name the hardware used.
Software Dependencies | No | The paper states 'Our model is implemented based on Jittor (Hu et al. 2020) and JDet library.' but does not provide version numbers for these software dependencies.
Experiment Setup | No | The paper states 'The implementation details of our model are given in the supplementary material' and does not provide specific experimental setup details such as hyperparameter values, learning rates, batch sizes, or optimizer settings in the main text.