Joint-task Self-supervised Learning for Temporal Correspondence

Authors: Xueting Li, Sifei Liu, Shalini De Mello, Xiaolong Wang, Jan Kautz, Ming-Hsuan Yang

NeurIPS 2019

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Our method outperforms the state-of-the-art self-supervised methods on a variety of visual correspondence tasks, including video-object and part-segmentation propagation, keypoint tracking, and object tracking. Our self-supervised method even surpasses the fully-supervised affinity feature representation obtained from a ResNet-18 pre-trained on the ImageNet." "We compare with state-of-the-art algorithms [45, 46, 52] on several tasks: instance mask propagation, pose keypoints tracking, human parts segmentation propagation and visual tracking." |
| Researcher Affiliation | Collaboration | Xueting Li¹, Sifei Liu², Shalini De Mello², Xiaolong Wang³, Jan Kautz², Ming-Hsuan Yang¹ (¹University of California, Merced; ²NVIDIA; ³Carnegie Mellon University) |
| Pseudocode | No | The paper illustrates its method with data-flow diagrams such as Figure 2 but does not include explicit pseudocode or labeled algorithm blocks. |
| Open Source Code | Yes | "The project website can be found at https://sites.google.com/view/uvc2019/." |
| Open Datasets | Yes | "We first train the auto-encoder in the matching module (the encoder E and decoder D in Figure 2) to reconstruct images in the Lab space using the MSCOCO [28] dataset. We then fix it and train the feature representation network using the Kinetics dataset [21]." The paper further mentions J-HMDB [19], OTB2015 [53], and VIP [59]. (See the training-schedule sketch below the table.) |
| Dataset Splits | No | The paper trains and evaluates on several datasets but does not provide explicit train/validation/test splits (percentages, counts, or instructions for splitting). |
| Hardware Specification | Yes | "We carry out all our experiments on servers equipped with four 16GB Tesla V100 GPUs." |
| Software Dependencies | No | The paper mentions ResNet-18 and the Adam optimizer but does not provide version numbers for software dependencies such as Python, PyTorch, or other libraries. |
| Experiment Setup | Yes | "We train our model using Adam [22] as the optimizer with a learning rate of 10^-4 for the warm-up and 0.5 × 10^-4 for the joint training of the localization and matching modules. We set the temperature in the softmax layer applied to the affinity matrix to 1, which empirically achieves the best performance." (See the configuration sketch below the table.) |
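The two-stage schedule quoted under Open Datasets can be pictured with a minimal PyTorch sketch. This is an assumption-laden illustration, not the authors' released code: the `ColorAutoencoder` module, the placeholder batches, and the reconstruction loss are hypothetical stand-ins for the encoder E / decoder D trained to reconstruct Lab-space MSCOCO images and then frozen while the feature network trains on Kinetics.

```python
# Minimal sketch of the two-stage training schedule quoted above.
# All module and data names are hypothetical placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ColorAutoencoder(nn.Module):
    """Stand-in for the matching module's encoder E and decoder D."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU())
        self.decoder = nn.Conv2d(64, 3, 3, padding=1)

    def forward(self, lab_image):
        return self.decoder(self.encoder(lab_image))

autoencoder = ColorAutoencoder()

# Placeholder batches standing in for MSCOCO images converted to Lab space
# (e.g., via skimage.color.rgb2lab during preprocessing).
mscoco_lab_batches = [torch.randn(4, 3, 64, 64) for _ in range(2)]

# Stage 1: train the auto-encoder to reconstruct Lab-space images.
opt = torch.optim.Adam(autoencoder.parameters(), lr=1e-4)
for lab in mscoco_lab_batches:
    loss = F.mse_loss(autoencoder(lab), lab)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Stage 2: fix the auto-encoder, then train the feature representation
# network (e.g., a ResNet-18 backbone, per the paper) on Kinetics frames.
for p in autoencoder.parameters():
    p.requires_grad = False
```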
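Likewise, the hyperparameters quoted under Experiment Setup can be collected into a short configuration sketch. Only the learning rates (10^-4 for warm-up, 0.5 × 10^-4 for joint training) and the softmax temperature of 1 come from the paper; the `affinity` helper and the placeholder `model` are hypothetical.

```python
import torch
import torch.nn as nn

# Hyperparameters quoted in the paper.
WARMUP_LR = 1e-4    # warm-up learning rate
JOINT_LR = 0.5e-4   # joint training of the localization and matching modules
TEMPERATURE = 1.0   # softmax temperature applied to the affinity matrix

def affinity(features_a, features_b, temperature=TEMPERATURE):
    """Temperature-scaled softmax affinity between two feature maps.

    features_a, features_b: (C, N) feature matrices, one column per location.
    Returns an (N, N) row-stochastic affinity matrix.
    """
    sim = features_a.t() @ features_b  # (N, N) dot-product similarity
    return torch.softmax(sim / temperature, dim=1)

# Example usage with random features.
A = affinity(torch.randn(64, 100), torch.randn(64, 100))

# Adam optimizers for the two phases (the model is a placeholder).
model = nn.Conv2d(3, 64, 3)
warmup_opt = torch.optim.Adam(model.parameters(), lr=WARMUP_LR)
joint_opt = torch.optim.Adam(model.parameters(), lr=JOINT_LR)
```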