Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
TAPIP3D: Tracking Any Point in Persistent 3D Geometry
Authors: Bowei Zhang, Lei Ke, Adam Harley, Katerina Fragkiadaki
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test TAPIP3D in established 3D point tracking benchmarks of TAPVid3D [21], LSFOdyssey [40], Dynamic Replica [18] and DexYCB [4] for tracking points in 2D and 3D... We show TAPIP3D outperforms all previous methods in 3D point tracking metrics... In Tables 4 and 5 we ablate our model's design choices... |
| Researcher Affiliation | Academia | Bowei Zhang (1,2), Lei Ke (1), Adam W. Harley (3), Katerina Fragkiadaki (1). (1) Carnegie Mellon University, (2) Peking University, (3) Stanford University |
| Pseudocode | No | The paper describes the method using textual descriptions and architectural diagrams (e.g., Figure 2) but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | We will release our code upon acceptance. |
| Open Datasets | Yes | Our model is trained on the Kubric MOVi-F dataset [9]. We evaluate TAPIP3D on both 3D and 2D point tracking benchmarks... TAPVid3D [21], LSFOdyssey [40], Dynamic Replica [18] and DexYCB [4] |
| Dataset Splits | Yes | Our model is trained on the Kubric MOVi-F dataset [9]. We evaluate TAPIP3D on both 3D and 2D point tracking benchmarks... TAPVid3D [21], LSFOdyssey [40], Dynamic Replica [18] and DexYCB [4] |
| Hardware Specification | Yes | We train on 8 L40S GPUs with a batch size of 1 for 200K iterations... During inference, under BF16 mixed precision, our model achieves a speed of 10 FPS and consumes around 2.6GB of VRAM when tracking 1024 query points across 32 frames on an L40S GPU. |
| Software Dependencies | No | The paper mentions using AdamW [28] as the optimizer, but does not provide specific version numbers for programming languages (e.g., Python), deep learning frameworks (e.g., PyTorch, TensorFlow), or other software libraries used. |
| Experiment Setup | Yes | Implementation Details: Our model is trained on the Kubric MOVi-F dataset [9]. We initialize the image encoder with CoTracker3's pre-trained weights [19]. We train on 8 L40S GPUs with a batch size of 1 for 200K iterations... We optimize using AdamW [28] with the learning rate and weight decay both set to 5e-4... Table 6: Training hyperparameters: Learning rate 0.0005, Weight decay 0.0005, Iteration refinements (M_train) 4, LR schedule OneCycleLR, Training steps 200,000, Batch size 8, Optimizer AdamW, Max grad norm 10.0, Visibility loss weight (α_vis) 3.0, Loss discount factor (γ) 0.8, Total loss multiplier 0.005 |
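
For readers who want the Experiment Setup values in machine-readable form, the sketch below collects the Table 6 hyperparameters quoted above into a plain Python dataclass. The class name `TrainConfig`, its field names, and the helper `per_gpu_batch_size` are hypothetical and do not come from the paper or its code. The helper also rests on an assumption: that the "batch size of 1" in the quoted text is per GPU and the "Batch size 8" in Table 6 is the total across the 8 L40S GPUs. The paper excerpt does not state this explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as quoted from Table 6 in the Experiment Setup row.

    Field names are illustrative only; they are not the authors' identifiers.
    """
    learning_rate: float = 5e-4
    weight_decay: float = 5e-4
    refinement_iterations: int = 4        # M_train
    lr_schedule: str = "OneCycleLR"
    training_steps: int = 200_000
    batch_size: int = 8                   # as listed in Table 6 (assumed total)
    optimizer: str = "AdamW"
    max_grad_norm: float = 10.0
    visibility_loss_weight: float = 3.0   # alpha_vis
    loss_discount_factor: float = 0.8     # gamma
    total_loss_multiplier: float = 0.005


def per_gpu_batch_size(cfg: TrainConfig, num_gpus: int) -> int:
    """Split the total batch evenly across GPUs.

    Assumes plain data parallelism, which the quoted text does not confirm.
    """
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    if cfg.batch_size % num_gpus != 0:
        raise ValueError("total batch size must be divisible by the GPU count")
    return cfg.batch_size // num_gpus


cfg = TrainConfig()
# Under the stated assumption, 8 total samples over 8 GPUs gives 1 per GPU.
print(per_gpu_batch_size(cfg, num_gpus=8))
```

Under that reading, the two batch-size figures in the row are consistent with each other; if the assumption is wrong, only the helper needs to change, not the recorded values.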