RelationNet++: Bridging Visual Representations for Object Detection via Transformer Decoder
Authors: Cheng Chi, Fangyun Wei, Han Hu
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments are all implemented on the MMDetection v1.1.0 codebase [2]. All experiments are performed on the MS COCO dataset [19]. A union of the 80k train images and a 35k subset of val images is used for training. Most ablation experiments are studied on a subset of 5k unused val images (denoted as minival). |
| Researcher Affiliation | Collaboration | Cheng Chi, Institute of Automation, CAS, chicheng15@mails.ucas.ac.cn; Fangyun Wei, Microsoft Research Asia, fawe@microsoft.com; Han Hu, Microsoft Research Asia, hanhu@microsoft.com. The work was done while Cheng Chi was an intern at Microsoft Research Asia. |
| Pseudocode | No | The paper describes the steps of its proposed module but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | The code is available at https://github.com/microsoft/RelationNet2. |
| Open Datasets | Yes | Our experiments are all implemented on the MMDetection v1.1.0 codebase [2]. All experiments are performed on the MS COCO dataset [19]. |
| Dataset Splits | Yes | A union of the 80k train images and a 35k subset of val images is used for training. Most ablation experiments are studied on a subset of 5k unused val images (denoted as minival). |
| Hardware Specification | Yes | The real inference speeds of different models using a V100 GPU (fp32 mode) are shown in Table 11. |
| Software Dependencies | Yes | Our experiments are all implemented on the MMDetection v1.1.0 codebase [2]. |
| Experiment Setup | Yes | Unless otherwise stated, all training and inference details are kept the same as the default settings in MMDetection: initializing the backbone with the ImageNet [25] pretrained model; resizing input images so that the shorter side is 800 and the longer side is at most 1333; optimizing the whole network via SGD with 0.9 momentum and 0.0001 weight decay; and setting the initial learning rate to 0.02, decreased by a factor of 0.1 at epochs 8 and 11. In the large-model experiments in Tables 10 and 12, training runs for 20 epochs with the learning rate decreased at epochs 16 and 19. Multi-scale training is also adopted in the large-model experiments: for each mini-batch, the shorter side is randomly selected from the range [400, 1200]. (A hedged config sketch of these settings follows this table.) |
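
For concreteness, the quoted training settings can be expressed as an MMDetection v1.1.0-style config fragment. This is a minimal sketch under stated assumptions: the annotation file paths, batch size, and most pipeline steps are illustrative defaults, not taken from the authors' repository; only the resize scales, optimizer hyperparameters, and learning-rate schedule come from the paper.

```python
# Sketch of an MMDetection v1.1.0-style config fragment for the quoted setup.
# trainval35k = 80k train images + a 35k subset of val; minival = 5k unused val.

# Resize inputs: shorter side 800, longer side at most 1333.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize',
         mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375],
         to_rgb=True),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]

data = dict(
    imgs_per_gpu=2,      # assumption: MMDetection v1.x default batch size per GPU
    workers_per_gpu=2,
    train=dict(
        type='CocoDataset',
        ann_file='data/coco/annotations/instances_trainval35k.json',  # hypothetical path
        pipeline=train_pipeline),
    val=dict(
        type='CocoDataset',
        ann_file='data/coco/annotations/instances_minival.json'))     # hypothetical path

# SGD with 0.9 momentum and 0.0001 weight decay; initial learning rate 0.02.
optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)

# Default schedule: 12 epochs, lr decreased by a factor of 0.1 at epochs 8 and 11.
lr_config = dict(policy='step', step=[8, 11])
total_epochs = 12

# Large-model runs (Tables 10 and 12) instead use 20 epochs with decay at
# epochs 16 and 19, plus multi-scale training that samples the shorter side
# from [400, 1200], e.g.:
#   dict(type='Resize', img_scale=[(1333, 400), (1333, 1200)],
#        multiscale_mode='range', keep_ratio=True)
```

In MMDetection's step policy, `step=[8, 11]` multiplies the learning rate by 0.1 at the start of epochs 8 and 11, matching the schedule quoted above; the large-model variant would swap in `step=[16, 19]` with `total_epochs = 20`.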