Deep Patch Visual Odometry
Authors: Zachary Teed, Lahav Lipson, Jia Deng
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On standard benchmarks, DPVO outperforms all prior work... We evaluate DPVO on the TartanAir (44), TUM-RGBD (33), EuRoC (2) and ICL-NUIM (19) benchmarks. We perform ablation experiments on the TartanAir validation split and show results in Fig. C. |
| Researcher Affiliation | Academia | Zachary Teed Princeton University zteed@princeton.edu Lahav Lipson Princeton University llipson@princeton.edu Jia Deng Princeton University jiadeng@princeton.edu |
| Pseudocode | No | The paper describes the system and its components through text and diagrams (e.g., Fig. 2, Fig. 5, Fig. F), but does not include any section or figure explicitly labeled 'Pseudocode' or 'Algorithm', nor does it present structured code-like blocks. |
| Open Source Code | Yes | Code is available at https://github.com/princeton-vl/DPVO |
| Open Datasets | Yes | DPVO is trained entirely on TartanAir (44), a synthetic dataset (https://theairlab.org/tartanair-dataset/). This is the same synthetic training data used by previous VO systems (43; 37). TartanAir provides a large number of training scenes of both indoor and outdoor environments with varied lighting and weather conditions. The dataset provides depth and camera pose annotations, enabling one to generate optical flow by re-projecting the depth using the camera poses. |
| Dataset Splits | Yes | TartanAir validation split: We use the same 32-sequence validation split as DROID-SLAM and report aggregated results in Fig. 4b and compare with DROID-SLAM and ORB-SLAM3. |
| Hardware Specification | Yes | DROID-SLAM ... averages 40FPS on an RTX-3090 using 8.7GB GPU memory... On an RTX-3090, it averages 60FPS using only 4.9GB of memory... We train for a total of 240k iterations on a single RTX-3090 GPU with a batch size of 1. Training takes 3.5 days. |
| Software Dependencies | No | DPVO is implemented using PyTorch. Our visualizer is implemented using the Pangolin library. No specific version numbers for these or other software dependencies are provided. |
| Experiment Setup | Yes | We train for a total of 240k iterations on a single RTX-3090 GPU with a batch size of 1. Training takes 3.5 days. We use the AdamW optimizer and start with an initial learning rate of 8e-5 which is decayed linearly during training. We apply standard augmentation techniques such as resizing and color jitter. We train on sequences of length 15... We unroll 18 iterations of the update operator during training. ...Ours (Default) uses 96 patches per image and a 10 frame optimization window and Ours (Fast) uses 48 patches and a 7 frame optimization window. |
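The training configuration quoted in the Experiment Setup row (AdamW, initial learning rate 8e-5, linear decay over 240k iterations, batch size 1) can be sketched in PyTorch. This is an illustrative reconstruction of the reported hyperparameters only: the `Linear` model and dummy loss are placeholders, not DPVO's network or objective.

```python
# Hedged sketch of the reported training schedule: AdamW with an initial
# learning rate of 8e-5, decayed linearly to zero over 240k iterations,
# batch size 1. The model and loss below are placeholders for illustration.
import torch

TOTAL_STEPS = 240_000  # 240k iterations, as reported in the paper

model = torch.nn.Linear(8, 8)  # placeholder, not the DPVO network
optimizer = torch.optim.AdamW(model.parameters(), lr=8e-5)
scheduler = torch.optim.lr_scheduler.LinearLR(
    optimizer, start_factor=1.0, end_factor=0.0, total_iters=TOTAL_STEPS
)

for step in range(3):  # a few steps only; training runs for 240k iterations
    optimizer.zero_grad()
    loss = model(torch.randn(1, 8)).pow(2).mean()  # dummy loss, batch size 1
    loss.backward()
    optimizer.step()
    scheduler.step()  # learning rate decays linearly toward zero
```

`LinearLR` with `end_factor=0.0` gives the linear decay the paper describes; the authors' actual schedule implementation is not specified in the quoted text.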