Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

TAPIP3D: Tracking Any Point in Persistent 3D Geometry

Authors: Bowei Zhang, Lei Ke, Adam Harley, Katerina Fragkiadaki

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We test TAPIP3D in established 3D point tracking benchmarks of TAPVid3D [21], LSFOdyssey [40], Dynamic Replica [18] and DexYCB [4] for tracking points in 2D and 3D... We show TAPIP3D outperforms all previous methods in 3D point tracking metrics... In Tables 4 and 5 we ablate our model's design choices...
Researcher Affiliation | Academia | Bowei Zhang1,2 Lei Ke1 Adam W. Harley3 Katerina Fragkiadaki1 1Carnegie Mellon University 2Peking University 3Stanford University
Pseudocode | No | The paper describes the method using textual descriptions and architectural diagrams (e.g., Figure 2) but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | We will release our code upon acceptance.
Open Datasets | Yes | Our model is trained on the Kubric MOVi-F dataset [9]. We evaluate TAPIP3D on both 3D and 2D point tracking benchmarks... TAPVid3D [21], LSFOdyssey [40], Dynamic Replica [18] and DexYCB [4]
Dataset Splits | Yes | Our model is trained on the Kubric MOVi-F dataset [9]. We evaluate TAPIP3D on both 3D and 2D point tracking benchmarks... TAPVid3D [21], LSFOdyssey [40], Dynamic Replica [18] and DexYCB [4]
Hardware Specification | Yes | We train on 8 L40S GPUs with a batch size of 1 for 200K iterations... During inference, under BF16 mixed precision, our model achieves a speed of 10 FPS and consumes around 2.6GB of VRAM when tracking 1024 query points across 32 frames on an L40S GPU.
Software Dependencies | No | The paper mentions using AdamW [28] as the optimizer, but does not provide specific version numbers for programming languages (e.g., Python), deep learning frameworks (e.g., PyTorch, TensorFlow), or other software libraries used.
Experiment Setup | Yes | Implementation Details: Our model is trained on the Kubric MOVi-F dataset [9]. We initialize the image encoder with CoTracker3's pre-trained weights [19]. We train on 8 L40S GPUs with a batch size of 1 for 200K iterations... We optimize using AdamW [28] with the learning rate and weight decay both set to 5e-4... Table 6: Training hyperparameters: Learning rate 0.0005, Weight decay 0.0005, Iteration refinements (M_train) 4, LR schedule OneCycleLR, Training steps 200,000, Batch size 8, Optimizer AdamW, Max grad norm 10.0, Visibility loss weight (α_vis) 3.0, Loss discount factor (γ) 0.8, Total loss multiplier 0.005
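
For readers who want to reuse the quoted training hyperparameters, the following is a minimal sketch that collects them in one Python configuration object. The values come from the Table 6 excerpt quoted in the Experiment Setup row. The class name, field names, and both helper functions are hypothetical and do not come from the paper or its code. Two assumptions are made and marked in the comments: that the "Batch size 8" entry is a global batch split evenly over the 8 GPUs (which would match the quoted "batch size of 1"), and that the loss discount factor is applied RAFT-style across refinement iterations, which the quoted text does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Values from the quoted Table 6 excerpt; field names are hypothetical."""
    learning_rate: float = 5e-4
    weight_decay: float = 5e-4
    refinement_iters: int = 4             # "Iteration refinements (M_train)"
    lr_schedule: str = "OneCycleLR"
    training_steps: int = 200_000
    batch_size: int = 8
    optimizer: str = "AdamW"
    max_grad_norm: float = 10.0
    visibility_loss_weight: float = 3.0   # alpha_vis
    loss_discount: float = 0.8            # gamma
    total_loss_multiplier: float = 0.005
    num_gpus: int = 8                     # "8 L40S GPUs"


def per_gpu_batch_size(cfg: TrainConfig) -> int:
    # ASSUMPTION: batch_size is a global value split evenly across GPUs.
    return cfg.batch_size // cfg.num_gpus


def refinement_loss_weights(cfg: TrainConfig) -> list[float]:
    # ASSUMPTION: RAFT-style discounting, where refinement i (0-based) is
    # weighted by gamma ** (M - 1 - i). The quoted excerpt gives only gamma.
    m = cfg.refinement_iters
    return [cfg.loss_discount ** (m - 1 - i) for i in range(m)]


cfg = TrainConfig()
print(per_gpu_batch_size(cfg))
print(refinement_loss_weights(cfg))
```

Under these assumptions the last refinement iteration receives full weight and earlier iterations are progressively down-weighted. If the released code applies the discount differently, only `refinement_loss_weights` would need to change.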