Truly Scale-Equivariant Deep Nets with Fourier Layers

Authors: Md Ashiqur Rahman, Raymond A. Yeh

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct our experiments on the MNIST-scale [40] and STL [4] datasets. By design, our method achieves zero scale-equivariance error both in theory and in practice. In terms of accuracy, we compare to recent scale-equivariant CNNs. We found our approach to be competitive in classification accuracy and to exhibit better data efficiency in low-resource settings. Our contributions are as follows: We conduct extensive experiments validating the proposed approach. On the MNIST and STL datasets, the proposed model achieves an absolute zero end-to-end scale-equivariance error while maintaining competitive classification accuracy. (A sketch of how such an equivariance error can be measured appears after this table.)
Researcher Affiliation | Academia | Md Ashiqur Rahman and Raymond A. Yeh, Department of Computer Science, Purdue University, {rahman79, rayyeh}@purdue.edu
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide any concrete access (a link or an explicit statement of release) to open-source code for the methodology.
Open Datasets | Yes | We conduct our experiments on the MNIST-scale [40] and STL [4] datasets. Each image in the original MNIST dataset is randomly downsampled with a factor in [1/0.3, 1], such that every resolution from 8×8 to 28×28 contains an equal number of samples. Each image of the dataset is randomly scaled with a randomly chosen downsampling factor between [1, 2] such that every resolution from 48 to 97 contains an equal number of samples. (A sketch of this scaling procedure appears after this table.)
Dataset Splits | Yes | We used 10k, 2k, and 50k samples for the training, validation, and test sets. We use 7k, 1k, and 5k samples in our training, validation, and test sets.
Hardware Specification | No | The paper mentions general hardware terms like "Modern GPUs" and "executed on a GPU" but does not specify any exact models or detailed specifications of the hardware used for the experiments.
Software Dependencies | No | The paper mentions using the "Adam optimizer" but does not provide specific version numbers for any software libraries or dependencies (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | For the baselines and CNN, we follow the implementation, hyperparameters, and architecture provided in prior works [41, 42]. For the Fourier CNN, we use the Fourier block introduced in the Fourier Neural Operator [18]. Inspired by their design, we use a 1×1 complex convolution in the Fourier domain along with the scale-equivariant convolution. We follow the baseline for all training hyper-parameters, except we include a weight decay of 0.01. All of the models are trained for 250 epochs with the Adam optimizer with an initial learning rate of 0.01. The learning rate is reduced by a factor of 0.1 after every 100 epochs. (Sketches of a Fourier-domain 1×1 complex convolution and of this training configuration appear after this table.)
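To make the "zero end-to-end scale-equivariance error" claim from the Research Type row concrete, the snippet below is a minimal sketch of how such an error is typically measured: rescale the input, run the network, and compare against the correspondingly rescaled output of the original input. It is not the authors' code (the paper links none); the feature extractor f, the bilinear interpolation, and the test scales are illustrative assumptions. The paper's exact-zero result holds with respect to its own Fourier-domain definition of rescaling, for which bilinear interpolation is only a stand-in here.

import torch
import torch.nn.functional as F

def scale_equivariance_error(f, x, scales=(0.5, 0.75)):
    # Relative L2 distance between f(downscale(x)) and downscale(f(x)),
    # averaged over a few test scales; a perfectly scale-equivariant f gives 0.
    errs = []
    for s in scales:
        xs = F.interpolate(x, scale_factor=s, mode="bilinear", align_corners=False)
        lhs = f(xs)                                    # network applied to the rescaled input
        rhs = F.interpolate(f(x), size=lhs.shape[-2:],
                            mode="bilinear", align_corners=False)  # rescaled original output
        errs.append((lhs - rhs).norm() / rhs.norm())
    return torch.stack(errs).mean()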
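The dataset protocol quoted in the Open Datasets row (random downsampling so that every resolution appears equally often) can be sketched as follows. This is an illustration under stated assumptions, not the authors' data pipeline: the uniform choice over integer target sizes and the bilinear resampling are guesses consistent with the quoted ranges (8 to 28 for MNIST-scale; the analogous STL range would be 48 to 97).

import random
import torch
import torch.nn.functional as F

def randomly_downsample(img, min_size=8, max_size=28):
    # img: (C, H, W) tensor at the original resolution (e.g., 28x28 for MNIST).
    # Pick a target resolution uniformly so every size in [min_size, max_size]
    # receives an equal share of the dataset, then downsample to it.
    size = random.randint(min_size, max_size)
    return F.interpolate(img.unsqueeze(0), size=(size, size),
                         mode="bilinear", align_corners=False).squeeze(0)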
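The Experiment Setup row mentions a 1×1 complex convolution applied in the Fourier domain, in the spirit of the Fourier Neural Operator block, together with the quoted training hyper-parameters (Adam, initial learning rate 0.01, weight decay 0.01, 250 epochs, learning rate decayed by 0.1 every 100 epochs). The sketch below shows what such a spectral 1×1 convolution and training configuration could look like in PyTorch. It is an illustrative reconstruction, not the authors' architecture; the toy classifier around it is hypothetical and omits the scale-equivariant convolution it would be combined with.

import torch
import torch.nn as nn

class Spectral1x1Conv(nn.Module):
    # A 1x1 complex convolution in the Fourier domain: FFT the feature map,
    # mix channels at every frequency with a shared complex weight matrix,
    # then transform back. Per-frequency channel mixing is what a 1x1
    # convolution becomes after the Fourier transform.
    def __init__(self, in_channels, out_channels):
        super().__init__()
        scale = 1.0 / (in_channels * out_channels)
        self.weight = nn.Parameter(
            scale * torch.randn(in_channels, out_channels, dtype=torch.cfloat))

    def forward(self, x):                      # x: (B, C_in, H, W), real-valued
        x_hat = torch.fft.rfft2(x)             # (B, C_in, H, W//2 + 1), complex
        y_hat = torch.einsum("bihw,io->bohw", x_hat, self.weight)
        return torch.fft.irfft2(y_hat, s=x.shape[-2:])

# Training configuration matching the quoted hyper-parameters
# (the model itself is a placeholder).
model = nn.Sequential(Spectral1x1Conv(1, 16), nn.GELU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10))
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=0.01)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=100, gamma=0.1)
# for each of the 250 epochs: train one pass over the data, then call scheduler.step()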