Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

ODAM: Gradient-based Instance-Specific Visual Explanations for Object Detection

Authors: Chenyang Zhao, Antoni B. Chan

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present a detailed analysis of the visualized explanations of detectors and carry out extensive experiments to validate the effectiveness of the proposed ODAM.
Researcher Affiliation | Academia | Chenyang Zhao & Antoni B. Chan, Department of Computer Science, City University of Hong Kong, EMAIL, EMAIL
Pseudocode | Yes | The pseudo-code for ODAM-NMS is presented in Algorithm 1.
Open Source Code | No | No explicit statement about providing open-source code for the methodology was found, nor a direct link to a code repository.
Open Datasets | Yes | Two datasets are adopted for evaluation: MS COCO (Lin et al., 2014), a standard object detection dataset, and CrowdHuman (Shao et al., 2018)
Dataset Splits | Yes | Besides the MS COCO val set, results of the Pointing game and ODI are also reported on CrowdHuman validation sets.
Hardware Specification | Yes | Experiments are performed using PyTorch and an RTX 3090 GPU.
Software Dependencies | No | The paper mentions PyTorch but does not specify its version or any other software dependencies with specific versions.
Experiment Setup | Yes | For ODAM-Train on MS COCO, the detector training pipeline is totally the same as the baseline (Tian et al., 2019; Ren et al., 2015), which uses SGD as the optimizer running for 12 epochs with batch size 16, learning rate 0.2 for two-stage Faster R-CNN and learning rate 0.1 for FCOS. For training on CrowdHuman, the aspect ratios of the anchors in Faster R-CNN are set to H : W = {1, 2, 3} : 1 since the dataset contains people, and training runs for 30 epochs. Other parameters are the same as in training on MS COCO.
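The hyperparameters quoted in the Experiment Setup row can be collected into a small config helper. This is a hedged sketch for illustration only, not the authors' code: the function name, dictionary keys, and detector/dataset identifiers are assumptions; only the numeric values (optimizer, batch size, learning rates, epoch counts, anchor aspect ratios) come from the quoted setup.

```python
# Sketch of the reported ODAM-Train hyperparameters (assumed names,
# reported values). Not the authors' implementation.

def odam_train_config(detector: str, dataset: str) -> dict:
    """Return the reported SGD training settings for a detector/dataset pair."""
    if detector not in {"faster_rcnn", "fcos"}:
        raise ValueError(f"unknown detector: {detector}")
    if dataset not in {"mscoco", "crowdhuman"}:
        raise ValueError(f"unknown dataset: {dataset}")
    cfg = {
        "optimizer": "SGD",
        "batch_size": 16,
        # 0.2 for two-stage Faster R-CNN, 0.1 for FCOS, per the quote.
        "learning_rate": 0.2 if detector == "faster_rcnn" else 0.1,
        # 12 epochs on MS COCO, 30 epochs on CrowdHuman.
        "epochs": 12 if dataset == "mscoco" else 30,
    }
    if detector == "faster_rcnn" and dataset == "crowdhuman":
        # Person-shaped anchors: H : W = {1, 2, 3} : 1, as (H, W) pairs.
        cfg["anchor_aspect_ratios"] = [(1, 1), (2, 1), (3, 1)]
    return cfg
```

For example, `odam_train_config("faster_rcnn", "crowdhuman")` yields the 30-epoch schedule with learning rate 0.2 and the tall anchor ratios, while `odam_train_config("fcos", "mscoco")` yields the 12-epoch schedule with learning rate 0.1.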