Objects in Semantic Topology
Authors: Shuo Yang, Peize Sun, Yi Jiang, Xiaobo Xia, Ruiheng Zhang, Zehuan Yuan, Changhu Wang, Ping Luo, Min Xu
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4 (Experiments) states: 'We introduce the evaluation protocol, including datasets and evaluation metrics, implementation details, and experimental results in this section.' |
| Researcher Affiliation | Collaboration | University of Technology Sydney; The University of Hong Kong; ByteDance AI Lab; University of Sydney; Beijing Institute of Technology |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper states: 'To make a fair comparison, we report experimental results obtained by re-running the ORE (Joseph et al., 2021) official code1. All the hyper-parameters and optimizers are also controlled to be exactly the same for all methods.' The provided link (https://github.com/JosephKJ/OWOD) is for the baseline ORE's official code, not the code for the method proposed in this paper. |
| Open Datasets | Yes | Following (Joseph et al., 2021), the open-world detector is evaluated on all 80 object classes from Pascal VOC (Everingham et al., 2010) (20 classes) and MS-COCO (Lin et al., 2014) (20+60 classes). |
| Dataset Splits | Yes | The open-world object detector is trained on the training set of all classes from Pascal VOC and MS-COCO, and evaluated on the Pascal VOC test split and MS-COCO val split. The validation set consists of 1k images from the training data of each task. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions software components like Faster R-CNN, ResNet-50, the CLIP text encoder, and BERT, but it does not provide specific version numbers for any of these dependencies, which are required for reproducibility. |
| Experiment Setup | Yes | The RoI feature is extracted from the last residual block in the RoI head and is 2048-dimensional. The semantic projector is a fully-connected layer that aligns the dimension of RoI features with the semantic anchors. The dimension of the semantic anchors depends on the choice of pre-trained language model, e.g., the semantic anchor is 512-dim when using the CLIP text encoder... For the topology stabilization, we store 100 instances per class... (see the sketch below the table) |
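
To make the "Experiment Setup" row concrete, here is a minimal sketch, assuming PyTorch. The class names (`SemanticProjector`, `InstanceStore`) and the surrounding wiring are illustrative assumptions, not the authors' released implementation; only the dimensions (2048-dim RoI features, 512-dim CLIP-text anchors) and the 100-instances-per-class memory come from the paper's description.

```python
# Illustrative sketch of the reported setup: a single fully-connected
# projector from 2048-dim RoI features to the 512-dim semantic-anchor
# space, plus a per-class feature store capped at 100 instances.
from collections import defaultdict, deque

import torch
import torch.nn as nn


class SemanticProjector(nn.Module):
    """Aligns RoI features with the semantic-anchor space (hypothetical name)."""

    def __init__(self, roi_dim: int = 2048, anchor_dim: int = 512):
        super().__init__()
        # One fully-connected layer, as described in the paper.
        self.fc = nn.Linear(roi_dim, anchor_dim)

    def forward(self, roi_feats: torch.Tensor) -> torch.Tensor:
        return self.fc(roi_feats)


class InstanceStore:
    """Per-class memory for topology stabilization, keeping at most 100 instances per class."""

    def __init__(self, max_per_class: int = 100):
        self.store = defaultdict(lambda: deque(maxlen=max_per_class))

    def update(self, feats: torch.Tensor, labels: torch.Tensor) -> None:
        for feat, label in zip(feats, labels):
            self.store[int(label)].append(feat.detach().cpu())


# Usage: project a batch of RoI features and cache them by class label.
projector = SemanticProjector()
store = InstanceStore()
roi_feats = torch.randn(8, 2048)      # dummy RoI features
labels = torch.randint(0, 20, (8,))   # dummy class ids
anchor_space_feats = projector(roi_feats)
store.update(anchor_space_feats, labels)
```

The deque with `maxlen=100` is one simple way to honor the "100 instances per class" budget; the paper does not specify the eviction policy, so first-in-first-out here is an assumption.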