Sim2Real Object-Centric Keypoint Detection and Description

Authors: Chengliang Zhong, Chao Yang, Fuchun Sun, Jinshan Qi, Xiaodong Mu, Huaping Liu, Wenbing Huang

AAAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comprehensive experiments on image matching and 6D pose estimation verify the encouraging generalization ability of our method from simulation to reality.
Researcher Affiliation | Collaboration | Chengliang Zhong*1,2, Chao Yang*2, Fuchun Sun2, Jinshan Qi4, Xiaodong Mu1, Huaping Liu2, Wenbing Huang3. 1 Xi'an Research Institute of High-Tech, Xi'an 710025, China; 2 Beijing National Research Center for Information Science and Technology (BNRist), State Key Lab on Intelligent Technology and Systems, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China; 3 Institute for AI Industry Research, Tsinghua University; 4 Shandong University of Science and Technology, Qingdao 266590, China
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | To bootstrap the object-centric keypoint detection and description, we first create a large-scale object-clustered synthetic dataset that consists of 21 objects from the YCB-Video dataset (Xiang et al. 2018).
Dataset Splits | No | The paper describes training on synthetic data generated from YCB-Video and testing on real data from YCB-Video, but it does not specify explicit training/validation/test dataset splits with percentages or counts.
Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used to run its experiments.
Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al. 2019)' and 'Adam (Kingma and Ba 2017)' but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | We choose 20 keypoints of each object to construct positive-negative pairs, and the temperatures τ in the intra-object InfoNCE (van den Oord, Li, and Vinyals 2019) loss and the inter-object InfoNCE loss are set to 0.07 and 0.2, respectively. The data augmentation is composed of color jittering, random gray-scale conversion, Gaussian noise, Gaussian blur, and random rotation. The δ and N are set to 8 and 16 pixels. We set the trade-off weights of the two subparts of the descriptor to λ1 = 1 and λ2 = 1. Our model is implemented in PyTorch (Paszke et al. 2019) with a mini-batch size of 4 and optimized with Adam (Kingma and Ba 2017) for 20 epochs, and all input images are cropped to 320×320. We use a learning rate of 10⁻⁴ for the first 15 epochs, which is then divided by ten for the remainder.
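
The Experiment Setup row contains enough detail to sketch the training configuration. Below is a minimal PyTorch sketch under stated assumptions: the batched InfoNCE helper, the augmentation parameter values, the stand-in model, and the way the two contrastive terms are combined are all illustrative guesses, since the paper releases no code. Only the quoted hyper-parameters (temperatures 0.07/0.2, λ1 = λ2 = 1, mini-batch size 4, 20 epochs, 320×320 crops, and the learning-rate schedule) come from the paper.

```python
import torch
import torch.nn.functional as F
import torchvision.transforms as T

# Hyper-parameters quoted from the paper's experiment setup.
TAU_INTRA, TAU_INTER = 0.07, 0.2   # intra-/inter-object InfoNCE temperatures
LAMBDA_1 = LAMBDA_2 = 1.0          # trade-off weights of the descriptor subparts
EPOCHS, BATCH_SIZE, CROP = 20, 4, 320

# Augmentations listed in the paper, applied to image tensors. The parameter
# values below are placeholders; the paper does not state them.
augment = T.Compose([
    T.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4, hue=0.1),
    T.RandomGrayscale(p=0.2),
    T.RandomRotation(degrees=30),
    T.GaussianBlur(kernel_size=9),
    T.Lambda(lambda x: x + 0.01 * torch.randn_like(x)),  # Gaussian noise
])

def info_nce(anchors, positives, temperature):
    """Batched InfoNCE over L2-normalised descriptors of shape (N, D).

    Each anchor's matching descriptor sits on the diagonal of the similarity
    matrix; the other entries in its row act as in-batch negatives.
    """
    logits = anchors @ positives.t() / temperature
    targets = torch.arange(anchors.size(0))
    return F.cross_entropy(logits, targets)

# Stand-in network: only the optimisation schedule is reproduced here, not
# the paper's detector/descriptor architecture.
model = torch.nn.Conv2d(3, 64, kernel_size=3, padding=1)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
# 1e-4 for the first 15 epochs, then divided by ten for the remaining 5.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[15], gamma=0.1)

for epoch in range(EPOCHS):
    # Per batch (4 augmented crops of 320x320): forward pass, gather 20
    # keypoint descriptor pairs per object, then combine the two terms:
    #   loss = LAMBDA_1 * info_nce(a, p_intra, TAU_INTRA) \
    #        + LAMBDA_2 * info_nce(a, p_inter, TAU_INTER)
    #   loss.backward(); optimizer.step(); optimizer.zero_grad()
    scheduler.step()
```

The diagonal-positive, in-batch-negative formulation is one common reading of "positive-negative pairs"; the paper's exact sampling of the 20 keypoints per object, and whether λ1/λ2 weight precisely these two terms, may differ.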