Sim2Real Object-Centric Keypoint Detection and Description
Authors: Chengliang Zhong, Chao Yang, Fuchun Sun, Jinshan Qi, Xiaodong Mu, Huaping Liu, Wenbing Huang
AAAI 2022, pp. 5440-5449 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments on image matching and 6D pose estimation verify the encouraging generalization ability of our method from simulation to reality. |
| Researcher Affiliation | Collaboration | Chengliang Zhong*1,2, Chao Yang*2, Fuchun Sun2, Jinshan Qi4, Xiaodong Mu1, Huaping Liu2, Wenbing Huang3. 1Xi'an Research Institute of High-Tech, Xi'an 710025, China; 2Beijing National Research Center for Information Science and Technology (BNRist), State Key Lab on Intelligent Technology and Systems, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China; 3Institute for AI Industry Research, Tsinghua University; 4Shandong University of Science and Technology, Qingdao 266590, China |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | To bootstrap the object-centric keypoint detection and description, we first create a large-scale object-clustered synthetic dataset that consists of 21 objects from the YCB-Video dataset (Xiang et al. 2018). |
| Dataset Splits | No | The paper describes training on synthetic data generated from YCB-Video and testing on real data from YCB-Video, but it does not specify explicit training/validation/test dataset splits with percentages or counts. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used to run its experiments. |
| Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al. 2019)' and 'Adam (Kingma and Ba 2017)' but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | We choose 20 keypoints of each object to construct positive-negative pairs, and the temperature τ in the intra-object InfoNCE (van den Oord, Li, and Vinyals 2019) loss and inter-object InfoNCE loss is set to 0.07 and 0.2, respectively. The data augmentation is composed of color jittering, random gray-scale conversion, Gaussian noise, Gaussian blur, and random rotation. The δ and N are set to 8 and 16 pixels. We set the trade-off weights of the two subparts of the descriptor to λ1 = 1 and λ2 = 1. Our model is implemented in PyTorch (Paszke et al. 2019) with a mini-batch size of 4 and optimized with Adam (Kingma and Ba 2017) for 20 epochs, and all the input images are cropped to 320×320. We use a learning rate of 10⁻⁴ for the first 15 epochs, which is dropped by a factor of ten for the remainder. |
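Since the paper releases no code, the InfoNCE term referenced in the setup row can only be sketched. Below is a minimal, dependency-free illustration of an InfoNCE loss over one anchor descriptor with the paper's stated temperatures (τ = 0.07 intra-object, τ = 0.2 inter-object); the function name, the assumption that descriptors are L2-normalized (so the dot product is cosine similarity), and the single-anchor formulation are all illustrative, not taken from the paper.

```python
import math

def info_nce(anchor, positive, negatives, tau):
    """InfoNCE loss for a single anchor descriptor (illustrative sketch).

    anchor, positive: L2-normalized descriptor vectors (lists of floats).
    negatives: list of L2-normalized negative descriptors.
    tau: temperature (0.07 intra-object, 0.2 inter-object per the paper).
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Similarity logits: positive pair first, then all negatives.
    logits = [dot(anchor, positive) / tau]
    logits += [dot(anchor, neg) / tau for neg in negatives]

    # Numerically stable log-sum-exp over all logits.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))

    # -log( exp(sim+/tau) / sum_j exp(sim_j/tau) )
    return log_sum - logits[0]

# A well-separated pair yields a near-zero loss; a confusable one does not.
low = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]], tau=0.07)
high = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]], tau=0.07)
```

The loss is minimized when the anchor is close to its positive and far from all negatives; the lower intra-object temperature (0.07) sharpens the softmax, penalizing near-duplicate keypoints on the same object more aggressively than the inter-object term (0.2).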