Rethinking Alignment in Video Super-Resolution Transformers

Authors: Shuwei Shi, Jinjin Gu, Liangbin Xie, Xintao Wang, Yujiu Yang, Chao Dong

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that: (i) VSR Transformers can directly utilize multi-frame information from unaligned videos, and (ii) existing alignment methods are sometimes harmful to VSR Transformers. ... In this section, we describe the settings used in our experiments in detail, including the Transformer architecture, alignment methods, datasets, metrics and implementation details.
Researcher Affiliation | Collaboration | Shuwei Shi 1,2, Jinjin Gu 3,4, Liangbin Xie 2,5,6, Xintao Wang 6, Yujiu Yang 1, Chao Dong 2,3; 1 Shenzhen International Graduate School, Tsinghua University; 2 Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; 3 Shanghai AI Laboratory; 4 The University of Sydney; 5 University of Chinese Academy of Sciences; 6 ARC Lab, Tencent PCG
Pseudocode | No | The paper does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | Codes and models will be released at https://github.com/XPixelGroup/RethinkVSRAlignment.
Open Datasets | Yes | In the VSR literature, the REDS [33] and the Vimeo-90K [47] datasets are the de-facto benchmarks.
Dataset Splits | Yes | The REDS [33] and the Vimeo-90K [47] datasets are the de-facto benchmarks. ... We follow the common splitting methods and split the data into training (266 sequences) and testing (4 sequences) sets. Vimeo-90K contains 64,612 and 7,824 video sequences for training and testing, respectively.
Hardware Specification | Yes | The experiments are implemented based on PyTorch [34] and conducted on NVIDIA A100 GPUs. ... (d) Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes]
Software Dependencies | No | The paper mentions 'implemented based on PyTorch [34]' and 'We implement all these methods using the BasicSR framework [44]' but does not provide specific version numbers for PyTorch or BasicSR.
Experiment Setup | Yes | We use the Charbonnier loss [18] as the training objective. The Adam optimization [17] method is used for training with β1 = 0.9 and β2 = 0.999. The initial learning rate is set to 4 × 10⁻⁴, and a Cosine Annealing scheme [31] is used to decay the learning rate. The total iteration number is set to 300,000. The mini-batch size is 8, and the LR patch size is 64 × 64.
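The training recipe quoted in the row above can be captured in a short, hedged PyTorch sketch. The tiny convolutional model and the random tensors below are placeholders for the VSR Transformer and the REDS/Vimeo-90K data loaders (the authors build on BasicSR); only the Charbonnier loss, the Adam settings, the cosine-annealed learning rate, the iteration count, the batch size, and the 64×64 LR patch size come from the quoted text.

import torch
import torch.nn as nn

def charbonnier_loss(pred, target, eps=1e-12):
    # Charbonnier loss: a smooth, differentiable approximation of the L1 loss.
    return torch.sqrt((pred - target) ** 2 + eps).mean()

model = nn.Conv2d(3, 3, 3, padding=1)  # placeholder for the VSR Transformer
optimizer = torch.optim.Adam(model.parameters(), lr=4e-4, betas=(0.9, 0.999))
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=300_000)

batch_size, patch = 8, 64              # mini-batch size, 64x64 LR patches
for it in range(300_000):              # total number of training iterations
    lr_patches = torch.rand(batch_size, 3, patch, patch)  # stand-in LR inputs
    hr_patches = torch.rand(batch_size, 3, patch, patch)  # stand-in HR targets
    loss = charbonnier_loss(model(lr_patches), hr_patches)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()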