Fast Fourier Convolution

Authors: Lu Chi, Borui Jiang, Yadong Mu

NeurIPS 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We experimentally evaluate FFC in three major vision benchmarks (ImageNet for image recognition, Kinetics for video action recognition, MSCOCO for human keypoint detection). It consistently elevates accuracies in all above tasks by significant margins." |
| Researcher Affiliation | Academia | Lu Chi¹, Borui Jiang², Yadong Mu¹; ¹Wangxuan Institute of Computer Technology, ²Center for Data Science, Peking University |
| Pseudocode | Yes | "Figure 2: Pseudocode of Fourier Unit (FU)." (A PyTorch sketch of the FU appears below the table.) |
| Open Source Code | No | The paper contains no explicit statement about releasing code and no link to a code repository. |
| Open Datasets | Yes | "We evaluate FFC on three visual tasks: image classification, video action classification and human keypoint detection. The main scope of the first study on ImageNet [16]... We choose Kinetics-400 as the testbed... The evaluations are fully conducted on Microsoft COCO keypoint benchmark (http://cocodataset.org)." |
| Dataset Splits | Yes | "The validation accuracies are calculated in the same way as [11, 33, 12] based on 224×224 single center crop." (See the evaluation-transform sketch below.) |
| Hardware Specification | No | Only GPU counts are given, not models: "All the networks are optimized by SGD with a batch size of 256 on 4 GPUs. All models are initialized from the pretrained weights on ImageNet and trained on 4 GPUs with a batch size of 64 for total 100 epochs." |
| Software Dependencies | No | The paper does not mention specific software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions). |
| Experiment Setup | Yes | "Learning rate starts from 0.1 and decreases by a factor of 0.1 after 30, 60 and 80 epochs. Maximal training epochs are set to 90. Linear warm-up strategy is also adopted in the first 5 epochs. All the networks are optimized by SGD with a batch size of 256 on 4 GPUs. Common data augmentation is utilized, such as scale jittering and random flipping. The learning rate starts from 0.01 and decreases by a factor of 10 after 40 and 80 epochs. Dropout (0.5) after the global average pooling and weight decay (0.0001) are adopted to reduce over-fitting during training process." (See the learning-rate schedule sketch below.) |
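
The Pseudocode row refers to Figure 2's Fourier Unit (FU): a real 2D FFT, a pointwise convolution in the frequency domain, and an inverse FFT. Below is a minimal PyTorch sketch of that unit, assuming the modern `torch.fft` API (available from PyTorch 1.8; the paper predates it). The ortho normalization, the real/imaginary channel stacking, and the BN+ReLU placement follow common FU implementations and are assumptions rather than the authors' exact code.

```python
import torch
import torch.nn as nn


class FourierUnit(nn.Module):
    """Sketch of the Fourier Unit (FU) described in Figure 2 of the paper."""

    def __init__(self, channels: int):
        super().__init__()
        # 1x1 convolution applied in the frequency domain; real and
        # imaginary parts are stacked along the channel axis (2*C channels).
        self.conv = nn.Conv2d(2 * channels, 2 * channels, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(2 * channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Real 2D FFT over the spatial dims -> complex (B, C, H, W//2 + 1).
        ffted = torch.fft.rfft2(x, norm="ortho")
        # Split complex values into real channels: (B, 2C, H, W//2 + 1).
        ffted = torch.cat([ffted.real, ffted.imag], dim=1)
        # Pointwise conv + BN + ReLU in the frequency domain: each output
        # element mixes information from the entire spatial extent.
        ffted = self.relu(self.bn(self.conv(ffted)))
        # Recombine into a complex tensor and invert the FFT.
        real, imag = torch.chunk(ffted, 2, dim=1)
        return torch.fft.irfft2(torch.complex(real, imag), s=(h, w), norm="ortho")
```

In the full FFC block, this unit forms the global (spectral) path, giving every output location an image-wide receptive field alongside a conventional local convolution path.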
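
The Dataset Splits row quotes a 224×224 single center crop for validation. A typical torchvision pipeline implementing that protocol is sketched below; the Resize(256) shorter-side value and the ImageNet normalization statistics are standard-practice assumptions, not details stated in the paper.

```python
from torchvision import transforms

# Single-center-crop evaluation pipeline matching the quoted
# "224x224 single center crop" protocol.
val_transform = transforms.Compose([
    transforms.Resize(256),        # assumed shorter-side resize (standard practice)
    transforms.CenterCrop(224),    # the 224x224 single center crop from the paper
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # standard ImageNet stats
                         std=[0.229, 0.224, 0.225]),
])
```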
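
The Experiment Setup row describes the ImageNet schedule: SGD with batch size 256, initial learning rate 0.1, decay by a factor of 0.1 after epochs 30, 60, and 80, a 5-epoch linear warm-up, and 90 epochs in total. The following is a minimal PyTorch sketch of that schedule; the momentum value (0.9) is an assumption not stated in the excerpt, `model` is a stand-in, and the weight decay of 0.0001 is the value the paper quotes for its video experiments.

```python
import torch

model = torch.nn.Linear(10, 10)  # placeholder for the actual network
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=1e-4)

def lr_factor(epoch: int) -> float:
    if epoch < 5:
        # Linear warm-up over the first 5 epochs: 0.2, 0.4, ..., 1.0 of base lr.
        return (epoch + 1) / 5
    # Multiply by 0.1 at each milestone passed (epochs 30, 60, 80).
    return 0.1 ** sum(epoch >= m for m in (30, 60, 80))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lr_factor)

for epoch in range(90):
    # ... one training epoch over ImageNet with a global batch size of 256 ...
    scheduler.step()
```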