Context-PIPs: Persistent Independent Particles Demands Spatial Context Features
Authors: Weikang Bian, Zhaoyang Huang, Xiaoyu Shi, Yitong Dong, Yijin Li, Hongsheng Li
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Context-PIPs significantly improves PIPs across the board, reducing the Average Trajectory Error of Occluded points (ATE-Occ) on CroHD by 11.4% and increasing the Average Percentage of Correct Keypoint (A-PCK) on TAP-Vid-Kinetics by 11.8%. We evaluate our Context-PIPs on four benchmarks: FlyingThings++ [12], CroHD [38], TAP-Vid-DAVIS, and TAP-Vid-Kinetics [5]. (A sketch of the two metrics follows the table.) |
| Researcher Affiliation | Collaboration | 1CUHK MMLab 2Centre for Perceptual and Interactive Intelligence 3Zhejiang University. wkbian@outlook.com, drinkingcoder@link.cuhk.edu.hk, hsli@ee.cuhk.edu.hk. This project is funded in part by the National Key R&D Program of China Project 2022ZD0161100, by the Centre for Perceptual and Interactive Intelligence (CPII) Ltd under the Innovation and Technology Commission (ITC)'s InnoHK, and by the General Research Fund of Hong Kong RGC Project 14204021. Hongsheng Li is a PI of CPII under InnoHK. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | Demos are available at https://wkbian.github.io/Projects/Context-PIPs/. The paper does not explicitly state that the source code for the method is released or provide a direct link to a code repository. |
| Open Datasets | Yes | FlyingThings++ is a synthetic dataset based on FlyingThings3D [26], which contains 8-frame trajectories with occlusion. The Crowd of Heads Dataset (CroHD) is a high-resolution crowd head tracking dataset. TAP-Vid-DAVIS and TAP-Vid-Kinetics are two evaluation datasets in the TAP-Vid benchmark, both of which consist of real-world videos with accurate human annotations for point tracking. |
| Dataset Splits | No | The paper mentions training on FlyingThings++ and evaluating on other benchmarks, but does not explicitly provide details about a validation dataset split or how validation was performed during training. |
| Hardware Specification | No | The paper discusses computational efficiency (GFLOPs) but does not specify any particular hardware components such as GPU or CPU models used for the experiments. |
| Software Dependencies | No | The paper mentions 'pytorch-OpCounter [49]' but does not provide specific version numbers for software dependencies such as PyTorch, CUDA, or other libraries. (A usage sketch of the FLOP counter follows the table.) |
| Experiment Setup | Yes | We train our Context-PIPs with a batch size of 4 for 100,000 steps on FlyingThings++ with horizontal and vertical flipping. We use the one-cycle learning rate scheduler with a peak learning rate of 5e-4. During training, we set the convolution stride to 8 and the resolution of the input RGB images to 384×512, and randomly sample N = 128 visible query points for supervision. (A configuration sketch follows the table.) |
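
The two metrics quoted in the Research Type row are standard point-tracking measures. Below is a minimal sketch of how they are typically computed; the function names, tensor shapes, and the pixel thresholds (the TAP-Vid convention of 1, 2, 4, 8, 16) are illustrative assumptions, not the authors' evaluation code.

```python
import torch

def average_trajectory_error(pred, gt, mask):
    """Mean L2 distance between predicted and ground-truth trajectories,
    averaged over the frames selected by `mask` (e.g. occluded frames
    for ATE-Occ). pred, gt: (N, T, 2) point tracks; mask: (N, T) bool."""
    dist = (pred - gt).norm(dim=-1)           # (N, T) per-frame pixel errors
    return dist[mask].mean()

def average_pck(pred, gt, visible, thresholds=(1, 2, 4, 8, 16)):
    """A-PCK: fraction of visible points whose error falls within a pixel
    threshold, averaged over several thresholds (TAP-Vid convention)."""
    dist = (pred - gt).norm(dim=-1)[visible]  # errors at visible frames only
    return torch.stack([(dist < t).float().mean() for t in thresholds]).mean()
```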
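
The Hardware and Software rows reference GFLOPs measured with pytorch-OpCounter (the thop package). A hypothetical usage sketch follows; the placeholder model and the MACs-to-FLOPs doubling convention are assumptions, not the paper's measurement script.

```python
import torch
from thop import profile  # pytorch-OpCounter

model = torch.nn.Conv2d(3, 64, kernel_size=3, padding=1)  # placeholder model
dummy = torch.randn(1, 3, 384, 512)  # matches the paper's input resolution

macs, params = profile(model, inputs=(dummy,))
# thop reports multiply-accumulates; FLOPs ≈ 2 × MACs is one common convention.
print(f"GFLOPs ≈ {2 * macs / 1e9:.2f}, params ≈ {params / 1e6:.2f}M")
```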
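
The Experiment Setup row maps directly onto PyTorch's one-cycle scheduler. The sketch below wires the stated hyperparameters (batch size 4, 100,000 steps, peak learning rate 5e-4, N = 128 query points) into a dummy loop; the optimizer choice (AdamW), the stand-in model, and the placeholder loss are assumptions not stated in the paper excerpt.

```python
import torch

TOTAL_STEPS = 100_000  # training steps (paper)
BATCH_SIZE = 4         # batch size (paper)
N_QUERIES = 128        # visible query points sampled per batch (paper)

model = torch.nn.Linear(256, 2)  # stand-in for the Context-PIPs network
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)  # optimizer assumed
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=5e-4, total_steps=TOTAL_STEPS)  # one-cycle, peak 5e-4

for step in range(TOTAL_STEPS):
    feats = torch.randn(BATCH_SIZE * N_QUERIES, 256)  # dummy query features
    target = torch.randn(BATCH_SIZE * N_QUERIES, 2)   # dummy point offsets
    loss = (model(feats) - target).abs().mean()       # placeholder L1 loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```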