Objects in Semantic Topology

Authors: Shuo Yang, Peize Sun, Yi Jiang, Xiaobo Xia, Ruiheng Zhang, Zehuan Yuan, Changhu Wang, Ping Luo, Min Xu

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 4 (Experiments): "We introduce the evaluation protocol, including datasets and evaluation metrics, implementation details, and experimental results in this section."
Researcher Affiliation | Collaboration | (1) University of Technology Sydney; (2) The University of Hong Kong; (3) ByteDance AI Lab; (4) University of Sydney; (5) Beijing Institute of Technology
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper states: 'To make a fair comparison, we report experimental results obtained by re-running the ORE (Joseph et al., 2021) official code [footnote 1]. All the hyper-parameters and optimizers are also controlled to be exactly the same for all methods.' The provided link (https://github.com/JosephKJ/OWOD) is for the baseline ORE's official code, not the code for the method proposed in this paper.
Open Datasets | Yes | Following (Joseph et al., 2021), the open-world detector is evaluated on all 80 object classes from Pascal VOC (Everingham et al., 2010) (20 classes) and MS-COCO (Lin et al., 2014) (20+60 classes).
Dataset Splits | Yes | The open-world object detector is trained on the training set of all classes from Pascal VOC and MS-COCO, and evaluated on the Pascal VOC test split and MS-COCO val split. The validation set consists of 1k images from the training data of each task.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models.
Software Dependencies | No | The paper mentions software components such as Faster R-CNN, ResNet-50, the CLIP text encoder, and BERT, but it does not provide version numbers for any of them, which reproducibility requires.
Experiment Setup | Yes | The RoI feature is extracted from the last residual block in the RoI head, which has a 2048 dimension. The semantic projector is a fully-connected layer to align the dimension of RoI features with the semantic anchors. The dimension of semantic anchors depends on the choice of pre-trained language model, e.g., the semantic anchor is 512-dim when using CLIP text encoder... For the topology stabilization, we store 100 instances per class...
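
The dimensions quoted in the Experiment Setup row can be made concrete with a small sketch. The Python below is not the authors' code (the table notes that none was released); it only illustrates a single fully-connected layer mapping a 2048-dim RoI feature to a 512-dim vector, plus a store capped at 100 instances per class. The names SemanticProjector and InstanceStore, the weight initialization, and the oldest-first eviction rule are all assumptions made for illustration.

```python
import random
from collections import defaultdict, deque

ROI_DIM = 2048     # RoI feature size quoted in the setup
ANCHOR_DIM = 512   # semantic anchor size when the CLIP text encoder is used
PER_CLASS = 100    # instances stored per class for topology stabilization


class SemanticProjector:
    """One fully-connected layer, y = W x + b (hypothetical helper)."""

    def __init__(self, in_dim=ROI_DIM, out_dim=ANCHOR_DIM, seed=0):
        rng = random.Random(seed)
        scale = in_dim ** -0.5  # assumed init; the source does not specify one
        self.weight = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
                       for _ in range(out_dim)]
        self.bias = [0.0] * out_dim

    def __call__(self, feature):
        if len(feature) != len(self.weight[0]):
            raise ValueError("unexpected feature dimension")
        # Dot product of each weight row with the input, plus the bias term.
        return [sum(w * x for w, x in zip(row, feature)) + b
                for row, b in zip(self.weight, self.bias)]


class InstanceStore:
    """Keeps at most `per_class` features per label (hypothetical helper)."""

    def __init__(self, per_class=PER_CLASS):
        # deque(maxlen=...) drops the oldest entry once full; this eviction
        # rule is an assumption, not something stated in the source.
        self._buffers = defaultdict(lambda: deque(maxlen=per_class))

    def add(self, label, feature):
        self._buffers[label].append(feature)

    def count(self, label):
        return len(self._buffers.get(label, ()))


projector = SemanticProjector()
roi_feature = [0.01] * ROI_DIM          # stand-in for a real RoI feature
embedding = projector(roi_feature)      # now comparable to a 512-dim anchor

store = InstanceStore()
for _ in range(150):                    # add more than the per-class cap
    store.add("dog", embedding)

print(len(embedding), store.count("dog"))  # prints: 512 100
```

In a real detector the projector would be a trainable layer inside the RoI head rather than fixed random weights; the sketch only shows how the stated dimensions and the per-class cap fit together.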