Wasserstein Distances for Stereo Disparity Estimation
Authors: Divyansh Garg, Yan Wang, Bharath Hariharan, Mark Campbell, Kilian Q. Weinberger, Wei-Lun Chao
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our approach on a variety of tasks, including stereo disparity and depth estimation, and the downstream 3D object detection. ... We conduct comprehensive experiments and show that our algorithm leads to significant improvement in all three tasks. |
| Researcher Affiliation | Academia | Divyansh Garg1 Yan Wang1 Bharath Hariharan1 Mark Campbell1 Kilian Q. Weinberger1 Wei-Lun Chao2 1Cornell University, Ithaca, NY 2The Ohio State University, Columbus, OH |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | Our code will be available at https://github.com/Div99/W-Stereo-Disp. |
| Open Datasets | Yes | We evaluate our method on two challenging stereo benchmark datasets, i.e., Scene Flow [25] and KITTI 2015 [26], and on a 3D object detection benchmark KITTI 3D [9, 10]. |
| Dataset Splits | Yes | KITTI 3D contains 7,481 (pairs of) images for training and 7,518 (pairs of) images for testing. We follow the same training and validation splits as suggested by Chen et al. [6], containing 3,712 and 3,769 images, respectively. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments, only general information about training. |
| Software Dependencies | No | The paper mentions implementing a differentiable loss module in PyTorch but does not provide specific version numbers for software dependencies. |
| Experiment Setup | Yes | For Scene Flow, the models are trained from scratch with a constant learning rate of 0.001 for 10 epochs. For KITTI 2015, the models pre-trained on Scene Flow are fine-tuned following the default strategy of the vanilla models. ... We use a uniform grid of bin size 2 pixels to create the categorical distribution. |
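The setup row notes that the method builds a categorical distribution over disparity using a uniform grid with 2-pixel bins, and the paper's core idea is to compare this predicted distribution against the ground-truth disparity with a Wasserstein distance. As a rough illustration (not the authors' implementation, which is a differentiable PyTorch module), the 1-Wasserstein distance between a 1D categorical distribution and a Dirac delta at the true disparity has a closed form via the difference of CDFs; the sketch below uses NumPy and illustrative names:

```python
import numpy as np

def wasserstein1_to_dirac(probs, bins, target):
    """Illustrative 1-Wasserstein distance between a categorical
    disparity distribution and a Dirac delta at the true disparity.

    probs  : probabilities over uniformly spaced disparity bins (sums to 1)
    bins   : bin-center disparities, uniform spacing (e.g., 2 px as in the paper)
    target : ground-truth disparity (assumed to lie on the grid here)

    Uses the 1D closed form W1(p, q) = integral |F_p(x) - F_q(x)| dx,
    discretized over the bin grid.
    """
    cdf_pred = np.cumsum(probs)                 # CDF of the prediction
    cdf_true = (bins >= target).astype(float)   # CDF of the Dirac at target
    bin_width = bins[1] - bins[0]               # uniform grid spacing
    return np.sum(np.abs(cdf_pred - cdf_true)) * bin_width

# Example: 96 bins of width 2 px covering disparities 0..190
bins = np.arange(0.0, 192.0, 2.0)
probs = np.zeros_like(bins)
probs[10] = 1.0                    # all mass at disparity 20
w1 = wasserstein1_to_dirac(probs, bins, 26.0)
print(w1)                          # distance between disparities 20 and 26
```

For point-mass predictions this reduces to the absolute disparity error, while for spread-out multi-modal predictions it penalizes probability mass in proportion to how far it sits from the true disparity, which is the intuition behind using a Wasserstein loss here.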