Win-Win: Training High-Resolution Vision Transformers from Two Windows
Authors: Vincent Leroy, Jerome Revaud, Thomas Lucas, Philippe Weinzaepfel
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we first validate our Win-Win training strategy on a monocular task (semantic segmentation) in Section 4.1 and then present results for the binocular task of optical flow (Section 4.2). Please refer to Appendix D for more results on the monocular depth estimation task. |
| Researcher Affiliation | Industry | Vincent Leroy, Jerome Revaud, Thomas Lucas & Philippe Weinzaepfel, Naver Labs Europe, firstname.lastname@naverlabs.com |
| Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not explicitly state that open-source code for the methodology is provided, nor does it include a link to a repository. |
| Open Datasets | Yes | Experiments are performed on the BDD-100k dataset (Yu et al., 2020) that comprises 7,000 training images and 1,000 validation images in a driving scenario with 19 semantic classes. All images have a relatively high resolution of 1280×720 pixels. |
| Dataset Splits | Yes | Experiments are performed on the BDD-100k dataset (Yu et al., 2020) that comprises 7,000 training images and 1,000 validation images in a driving scenario with 19 semantic classes. Models are trained on Flying Chairs (Dosovitskiy et al., 2015), Flying Things (Mayer et al., 2016), and MPI-Sintel, from which two sequences are kept apart for validation. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions optimizers and their parameters but does not specify software dependencies like programming languages, libraries, or frameworks with version numbers (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | We use the AdamW (Loshchilov & Hutter, 2019) optimizer, with betas of 0.9 and 0.999, a cosine learning rate schedule with a base learning rate of 0.0001, with two warmup epochs, a weight decay of 0.05 and a learning rate layer decay of 0.75. We train our models for 200 epochs on the 7,000 training images from the BDD10k dataset... |
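
The quoted training recipe maps onto standard PyTorch components. Below is a minimal sketch of that configuration, assuming PyTorch; the `ToyViT` model and the `param_groups_with_layer_decay` helper are illustrative placeholders (the paper releases no code), but the hyperparameters match the quoted values.

```python
import math
import torch
import torch.nn as nn

# Illustrative toy model standing in for the paper's ViT backbone.
class ToyViT(nn.Module):
    def __init__(self, dim=64, depth=4, num_classes=19):
        super().__init__()
        self.embed = nn.Linear(3, dim)
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
            for _ in range(depth)
        )
        self.head = nn.Linear(dim, num_classes)

def param_groups_with_layer_decay(model, base_lr, weight_decay, layer_decay):
    """One param group per depth level; lr shrinks by `layer_decay` per
    level going from the head (full base lr) down to the embedding."""
    depth = len(model.blocks)
    groups = []
    def add(params, level):
        groups.append({
            "params": list(params),
            "lr": base_lr * layer_decay ** (depth + 1 - level),
            "weight_decay": weight_decay,
        })
    add(model.embed.parameters(), 0)         # embedding: smallest lr
    for i, block in enumerate(model.blocks):
        add(block.parameters(), i + 1)
    add(model.head.parameters(), depth + 1)  # head: full base lr
    return groups

model = ToyViT()
optimizer = torch.optim.AdamW(
    param_groups_with_layer_decay(model, base_lr=1e-4,
                                  weight_decay=0.05, layer_decay=0.75),
    betas=(0.9, 0.999),
)

# Cosine schedule over 200 epochs with 2 linear warmup epochs,
# matching the quoted setup.
epochs, warmup_epochs = 200, 2
def lr_lambda(epoch):
    if epoch < warmup_epochs:
        return (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / max(1, epochs - warmup_epochs)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
```

The per-level grouping above follows the common reading of "learning rate layer decay" (as popularized in BEiT/MAE fine-tuning): shallower layers receive progressively smaller learning rates, with the task head kept at the full base rate.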