Learning Transferable Reward for Query Object Localization with Policy Adaptation
Authors: Tingfeng Li, Shaobo Han, Martin Renqiang Min, Dimitris N. Metaxas
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on corrupted MNIST, CU-Birds, and COCO datasets demonstrate the effectiveness of our approach. In this section, we evaluate the generalization ability of the ordinal embedding as well as the performance of the trained localization agent on both source and target domains. Implementation details. To evaluate the learned ordinal embedding, we use OrdAcc, defined as the percentage of images within which the preference between a pair of perturbed boxes is correctly predicted. To evaluate object localization performance, we use the Correct Localization (CorLoc) (Deselaers et al., 2012) metric, which is defined as the percentage of images correctly localized according to the criterion IoU(b_p, g) ≥ 0.5, where b_p is the predicted box and g is the ground-truth box. |
| Researcher Affiliation | Collaboration | NEC Labs America, Department of Computer Science, Rutgers University {tl601,dnm}@cs.rutgers.edu, {shaobo,renqiang}@nec-labs.com |
| Pseudocode | Yes | Algorithm 1: Training localization agent using the proposed ordinal reward signal. |
| Open Source Code | Yes | Code available at https://github.com/litingfeng/Localization-by-OrdEmbed |
| Open Datasets | Yes | We evaluate our approach on distorted versions of the MNIST handwriting, the CUB-200-2011 birds (Wah et al., 2011), and the COCO (Lin et al., 2014) dataset. |
| Dataset Splits | Yes | To match test conditions, the training batch is split into two groups, and c is computed on a small subset that does not overlap with the training images to localize; during test-time adaptation, c becomes the prototype of the test exemplary set Etest. |
| Hardware Specification | Yes | All of our models were trained with the Adam optimizer (Kingma & Ba, 2015). We set margin m = 60 in all the experiments heuristically. All the models take less than one hour to finish training, implemented on PyTorch on a single NVIDIA A100 GPU. |
| Software Dependencies | No | The paper states 'implemented on PyTorch' and refers to the 'Adam optimizer (Kingma & Ba, 2015)', but it does not specify version numbers for PyTorch or any other software libraries, which is required for a reproducible description of ancillary software. |
| Experiment Setup | Yes | Implementation details. To evaluate the learned ordinal embedding, we use OrdAcc, defined as the percentage of images within which the preference between a pair of perturbed boxes is correctly predicted. To evaluate object localization performance, we use the Correct Localization (CorLoc) (Deselaers et al., 2012) metric, which is defined as the percentage of images correctly localized according to the criterion IoU(b_p, g) ≥ 0.5, where b_p is the predicted box and g is the ground-truth box. ... All of our models were trained with the Adam optimizer (Kingma & Ba, 2015). We set margin m = 60 in all the experiments heuristically. ... For MNIST, we use three convolutional layers with ReLU activation after each layer as the image encoder. ... For the CUB and the COCO datasets, we adopt layers before conv5_3 of VGG-16 pre-trained on ImageNet as the encoder, unless otherwise specified. ... Table 10: Summary of losses used on different datasets. |
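The two evaluation metrics quoted in the table can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' released code: the (x1, y1, x2, y2) box format is an assumption, and `ord_acc` models the paper's preference test as "the higher-IoU box of each perturbed pair has the smaller embedding-to-prototype distance", inferred from the quoted definitions.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes in (x1, y1, x2, y2) format (assumed)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def corloc(pred_boxes, gt_boxes, thresh=0.5):
    """CorLoc: fraction of images correctly localized, i.e. IoU(b_p, g) >= thresh."""
    hits = sum(iou(p, g) >= thresh for p, g in zip(pred_boxes, gt_boxes))
    return hits / len(gt_boxes)

def ord_acc(dist_better, dist_worse):
    """OrdAcc: fraction of perturbed-box pairs whose preference is correctly
    predicted -- here, the higher-IoU box lies closer to the class prototype
    in embedding space (a modeling assumption for this sketch)."""
    correct = sum(b < w for b, w in zip(dist_better, dist_worse))
    return correct / len(dist_better)
```

For example, a prediction identical to the ground truth scores IoU = 1.0 and counts toward CorLoc, while a half-overlapping box (IoU ≈ 0.33) does not at the 0.5 threshold.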