Object Pursuit: Building a Space of Objects via Discriminative Weight Generation
Authors: Chuanyu Pan, Yanchao Yang, Kaichun Mo, Yueqi Duan, Leonidas Guibas
ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform an extensive study of the key features of the proposed framework and analyze the characteristics of the learned representations. Furthermore, we demonstrate the capability of the proposed framework in learning representations that can improve label efficiency in downstream tasks. Our code and trained models are made publicly available at: https://github.com/pptrick/Object-Pursuit. We also perform experiments on one-shot and few-shot learning, and show the potential of the learned object-centric representations in effectively reducing supervisions for object detection. |
| Researcher Affiliation | Academia | Chuanyu Pan (1,*), Yanchao Yang (2,*), Kaichun Mo (2), Yueqi Duan (1,2), Leonidas Guibas (2); 1 Tsinghua University, 2 Stanford University; pancy17@mails.tsinghua.edu.cn, {yanchaoy, kaichun, guibas}@cs.stanford.edu, duanyueqi@tsinghua.edu.cn |
| Pseudocode | Yes | The proposed object pursuit framework is also summarized in Algorithm 1. A.1 ALGORITHM: Here is the Object Pursuit algorithm we describe in the method section. Algorithm 1: Object Pursuit |
| Open Source Code | Yes | Our code and trained models are made publicly available at: https://github.com/pptrick/Object-Pursuit. |
| Open Datasets | Yes | To learn diverse objects from variant positions and viewing angles, we collect synthetic data within the iThor environment (Kolve et al., 2017), which provides a set of interactive objects and scenes, as well as accurate modeling of the physics. YouTube-VOS: We train and evaluate our framework on the YouTube-VOS dataset, which contains 65 categories. CO3D: We also test our framework on CO3D. We perform one-shot learning on the DAVIS 2016 dataset (Perazzi et al., 2016), a video object segmentation dataset in the real scene. |
| Dataset Splits | Yes | The 138 objects are divided into 52 pretraining objects, 60 train objects for the pursuit process, and 25 test unseen objects. For evaluation, we preserve a separate set of 25 objects (unseen test objects) that never appear during training, and we use 27 objects (seen test objects) from the warm-up joint training described above to check the re-identification accuracy. Under the one-shot learning scheme, we fix the hypernet and the bases, and initialize the combination coefficient µ with only one training sample (the first frame in the sequence). From µ, we can get the representation z for the training object, generate the parameters of a segmentation network using the hypernet, then evaluate the segmentation accuracy. (A hedged sketch of this one-shot fitting loop follows the table.) |
| Hardware Specification | No | I did not find any specific hardware details such as GPU models (e.g., NVIDIA A100, RTX 2080 Ti, Tesla V100), CPU models, or cloud computing instance types used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'Deeplab v3+' as the segmentation network and 'resnet18' as the backbone, along with the 'dice score' for similarity measure. However, it does not specify software versions for programming languages, libraries (e.g., PyTorch, TensorFlow), or other dependencies. (A reference dice-score sketch follows the table.) |
| Experiment Setup | Yes | The sparsity constraint α is set to 0.2 and 0.1 for Eq. 3 and Eq. 4 respectively, and β = 0.04 for all our experiments. To improve the convergence, we also warm up the hypernetwork using the pretraining objects. During pretraining, each mini-batch contains training data from one object, and we randomly choose which object to use in the next batch. In backpropagation, we update the hypernetwork ψ and representation z for each object. (A hedged sketch of this pretraining loop follows the table.) |
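The one-shot scheme quoted under Dataset Splits (freeze the hypernet and the bases, then fit only the combination coefficient µ from a single annotated frame) can be read as a small optimization loop. Below is a minimal PyTorch-style sketch; the names `hypernet`, `bases`, and `forward_with_weights`, as well as the loss choice and step count, are assumptions made for illustration and are not the authors' released code.

```python
import torch
import torch.nn.functional as F

def one_shot_fit(hypernet, bases, forward_with_weights, image, mask,
                 steps=200, lr=1e-2):
    """Fit only the combination coefficient mu on one (image, mask) pair.

    The hypernetwork and the learned object bases stay frozen, matching the
    one-shot setting described in the paper; all interfaces here are assumed.
    """
    mu = torch.zeros(bases.shape[0], requires_grad=True)  # one coefficient per base
    opt = torch.optim.Adam([mu], lr=lr)
    for _ in range(steps):
        z = mu @ bases                                 # object representation from the bases
        weights = hypernet(z)                          # generate segmentation-network parameters
        logits = forward_with_weights(weights, image)  # run the generated network on the image
        loss = F.binary_cross_entropy_with_logits(logits, mask)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return mu.detach()
```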
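The dice score named under Software Dependencies is a standard overlap measure between a predicted mask and the ground truth. For reference, here is a minimal soft dice implementation in PyTorch; the smoothing constant `eps` is an assumption, not a value taken from the paper.

```python
import torch

def dice_score(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Soft dice score between predicted probabilities and a binary mask.

    dice = 2 * |pred * target| / (|pred| + |target|); eps avoids division by zero.
    """
    pred = pred.flatten()
    target = target.flatten()
    intersection = (pred * target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
```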
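The Experiment Setup row describes a warm-up pretraining stage in which every mini-batch comes from a single, randomly chosen object and both the hypernetwork ψ and that object's representation z are updated. The sketch below illustrates such a loop under assumed interfaces (`hypernet`, a list of per-object representations `z_list`, per-object data `loaders`, and `forward_with_weights`); the L1 regularizer is a stand-in for illustration and does not reproduce the paper's Eq. 3/Eq. 4 terms governed by α and β.

```python
import random
import torch
import torch.nn.functional as F

def pretrain(hypernet, z_list, loaders, forward_with_weights,
             steps=10_000, lr=1e-4, alpha=0.2):
    """Warm up the hypernetwork on the pretraining objects.

    Each mini-batch is drawn from one randomly chosen object; both the
    hypernetwork parameters and that object's representation z receive
    gradients, mirroring the setup quoted above. Interfaces are assumed.
    """
    params = list(hypernet.parameters()) + z_list  # z_list: leaf tensors with requires_grad=True
    opt = torch.optim.Adam(params, lr=lr)
    iters = [iter(dl) for dl in loaders]
    for _ in range(steps):
        k = random.randrange(len(loaders))         # pick which object the next batch uses
        try:
            images, masks = next(iters[k])
        except StopIteration:
            iters[k] = iter(loaders[k])
            images, masks = next(iters[k])
        weights = hypernet(z_list[k])               # generate segmentation weights from z_k
        logits = forward_with_weights(weights, images)
        loss = F.binary_cross_entropy_with_logits(logits, masks)
        loss = loss + alpha * z_list[k].abs().mean()  # illustrative stand-in regularizer
        opt.zero_grad()
        loss.backward()
        opt.step()
```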