Context-PIPs: Persistent Independent Particles Demands Spatial Context Features

Authors: Weikang Bian, Zhaoyang Huang, Xiaoyu Shi, Yitong Dong, Yijin Li, Hongsheng Li

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Context-PIPs significantly improves PIPs all-sided, reducing the Average Trajectory Error of Occluded Points (ATE-Occ) on CroHD by 11.4% and increasing the Average Percentage of Correct Keypoint (A-PCK) on TAP-Vid-Kinetics by 11.8%. (Sec. 4, Experiments) We evaluate our Context-PIPs on four benchmarks: FlyingThings++ [12], CroHD [38], TAP-Vid-DAVIS, and TAP-Vid-Kinetics [5]. (A sketch of these metrics is given after the table.)
Researcher Affiliation | Collaboration | 1CUHK MMLab, 2Centre for Perceptual and Interactive Intelligence, 3Zhejiang University. wkbian@outlook.com, drinkingcoder@link.cuhk.edu.hk, hsli@ee.cuhk.edu.hk. This project is funded in part by the National Key R&D Program of China Project 2022ZD0161100, by the Centre for Perceptual and Interactive Intelligence (CPII) Ltd under the Innovation and Technology Commission (ITC)'s InnoHK, and by the General Research Fund of Hong Kong RGC Project 14204021. Hongsheng Li is a PI of CPII under InnoHK.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | Demos are available at https://wkbian.github.io/Projects/Context-PIPs/. The paper does not explicitly state that the source code for the method is released, nor does it provide a direct link to a code repository.
Open Datasets | Yes | FlyingThings++ is a synthetic dataset based on FlyingThings3D [26], which contains 8-frame trajectories with occlusion. The Crowd of Heads Dataset (CroHD) is a high-resolution crowd head tracking dataset. TAP-Vid-DAVIS and TAP-Vid-Kinetics are two evaluation datasets in the TAP-Vid benchmark, both of which consist of real-world videos with accurate human annotations for point tracking.
Dataset Splits | No | The paper mentions training on FlyingThings++ and evaluating on the other benchmarks, but does not explicitly describe a validation split or how validation was performed during training.
Hardware Specification | No | The paper discusses computational efficiency (GFLOPs) but does not specify the hardware used for the experiments, such as GPU or CPU models.
Software Dependencies | No | The paper mentions pytorch-OpCounter [49] but does not provide version numbers for software dependencies such as PyTorch, CUDA, or other libraries. (A usage sketch of pytorch-OpCounter follows the table.)
Experiment Setup | Yes | We train our Context-PIPs with a batch size of 4 for 100,000 steps on FlyingThings++ with horizontal and vertical flipping. We use the one-cycle learning rate scheduler with a peak learning rate of 5e-4. During training, we set the convolution stride to 8 and the input RGB resolution to 384x512, and randomly sample N = 128 visible query points for supervision. (A training-loop sketch with these hyper-parameters follows the table.)
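
For reference, the sketch below shows one common way to compute the two metrics named in the Research Type row, trajectory error (ATE) and A-PCK. The exact evaluation protocol (occlusion masks, coordinate scaling, and the pixel-threshold set) is not specified on this page; the TAP-Vid-style thresholds (1, 2, 4, 8, 16) and the array shapes used here are assumptions.

    import numpy as np

    def average_trajectory_error(pred, gt, mask):
        """Mean L2 distance between predicted and ground-truth tracks over the
        points selected by `mask` (e.g. occluded points for ATE-Occ).
        pred, gt: (T, N, 2) arrays of xy pixel coordinates; mask: (T, N) bool."""
        err = np.linalg.norm(pred - gt, axis=-1)  # (T, N) per-point error
        return err[mask].mean()

    def average_pck(pred, gt, visible, thresholds=(1, 2, 4, 8, 16)):
        """Average Percentage of Correct Keypoints: fraction of visible points
        whose error is below each pixel threshold, averaged over the thresholds."""
        err = np.linalg.norm(pred - gt, axis=-1)[visible]
        return float(np.mean([(err < t).mean() for t in thresholds]))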
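
The Software Dependencies row notes that computational cost was measured with pytorch-OpCounter [49]. Below is a minimal sketch of how such a measurement is typically done with that package (imported as thop); the model here is a hypothetical stand-in, not the Context-PIPs network, and the 384x512 input merely mirrors the resolution quoted above.

    import torch
    import torch.nn as nn
    from thop import profile  # pytorch-OpCounter

    # Hypothetical stand-in network; the real Context-PIPs model is not released with this page.
    model = nn.Sequential(
        nn.Conv2d(3, 64, 3, padding=1),
        nn.ReLU(),
        nn.Conv2d(64, 2, 3, padding=1),
    )

    # One 384x512 RGB frame, matching the input resolution quoted in the Experiment Setup row.
    dummy = torch.randn(1, 3, 384, 512)

    # profile() returns multiply-accumulate counts (MACs), which papers often report as FLOPs.
    macs, params = profile(model, inputs=(dummy,))
    print(f"GMACs: {macs / 1e9:.2f}, params: {params / 1e6:.2f}M")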
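
The Experiment Setup row fixes the quoted training hyper-parameters. The sketch below wires them into a PyTorch one-cycle schedule; the network, optimizer choice (AdamW), and data pipeline are placeholders and assumptions, as only the batch size, step count, peak learning rate, input resolution, query-point count, and flip augmentation come from the paper text quoted above.

    import torch
    from torch.optim.lr_scheduler import OneCycleLR

    # Placeholder network; the real Context-PIPs architecture is not reproduced here.
    model = torch.nn.Conv2d(3, 2, 3, padding=1)

    # AdamW is an assumption; the quoted text only fixes the peak LR (5e-4) and the schedule type.
    optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)
    scheduler = OneCycleLR(optimizer, max_lr=5e-4, total_steps=100_000)

    for step in range(100_000):
        # Stand-in for a batch of 4 clips at 384x512 with random horizontal/vertical
        # flips and N = 128 visible query points per clip (data pipeline omitted).
        frames = torch.randn(4, 3, 384, 512)
        loss = model(frames).mean()  # stand-in loss; the real supervision targets point trajectories
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        scheduler.step()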