Neural Fourier Transform: A General Approach to Equivariant Representation Learning

Authors: Masanori Koyama, Kenji Fukumizu, Kohei Hayashi, Takeru Miyato

ICLR 2024

Reproducibility variables, results, and supporting evidence from the LLM response:
Research Type: Experimental. Evidence: "We also provide experimental results to demonstrate the application of NFT in typical scenarios with varying levels of knowledge about the acting group." (Section 5, Experiments)
Researcher Affiliation: Collaboration. Evidence: 1. Preferred Networks, Inc.; 2. The Institute of Statistical Mathematics; 3. University of Tübingen.
Pseudocode: Yes. Evidence:

from torch import nn
from einops.layers.torch import Rearrange

class EncoderAdapter(nn.Module):
    def __init__(self, num_patches, embed_dim, d_a, d_m):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, embed_dim // 4),
            Rearrange("b n c -> b c n"),
            nn.Linear(num_patches, num_patches // 4),
            nn.GELU(),
            nn.LayerNorm([embed_dim // 4, num_patches // 4]),
            Rearrange("b c n -> b (c n)"),
            nn.Linear(embed_dim * num_patches // 16, d_a * d_m),
            Rearrange("b (m a) -> b m a", m=d_m),
        )

    def forward(self, encoder_output):
        return self.net(encoder_output)
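The adapter's dimension bookkeeping can be sanity-checked without installing PyTorch or einops. The dependency-free helper below (illustrative only; the ViT-Base-style numbers 196 and 768 and the values of d_a and d_m are assumptions, not from the paper) traces each layer's effect on a (batch, num_patches, embed_dim) input:

```python
def adapter_output_shape(batch, num_patches, embed_dim, d_a, d_m):
    """Trace the tensor shape through each layer of the EncoderAdapter."""
    b, n, c = batch, num_patches, embed_dim
    shape = (b, n, c // 4)              # nn.Linear(embed_dim, embed_dim // 4)
    shape = (b, c // 4, n)              # Rearrange("b n c -> b c n")
    shape = (b, c // 4, n // 4)         # nn.Linear(num_patches, num_patches // 4)
    # GELU and LayerNorm([embed_dim // 4, num_patches // 4]) keep the shape.
    shape = (b, (c // 4) * (n // 4))    # Rearrange("b c n -> b (c n)")
    assert shape[1] == embed_dim * num_patches // 16  # matches the Linear's fan-in
    shape = (b, d_a * d_m)              # nn.Linear(..., d_a * d_m)
    return (b, d_m, d_a)                # Rearrange("b (m a) -> b m a", m=d_m)

# Hypothetical ViT-Base-like input: 196 patch tokens of dimension 768
print(adapter_output_shape(2, 196, 768, d_a=32, d_m=8))  # -> (2, 8, 32)
```

The trace shows why the flattening Linear must take exactly embed_dim * num_patches // 16 input features: both the channel and patch axes were reduced by a factor of 4 before being merged.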
Open Source Code: Yes. Evidence: Codes are in the supplementary material; details are in Appendix D.
Open Datasets: Yes. Evidence: U-NFT was trained on CIFAR-100 (Krizhevsky et al., 2009) sequences with T = 4. g-NFT was applied to the MNIST (LeCun et al., 1998) dataset with an SO(2) rotation action and used as a tool in unsupervised representation learning for OOD generalization, with rotated Fashion-MNIST (Xiao et al., 2017) and rotated Kuzushiji-MNIST (Clanuwat et al., 2018) as the two out-domains. Three further datasets were used: ModelNet10-SO3 (Liao et al., 2019) at 64x64 resolution, BRDFs (Greff et al., 2022) (224x224), and ABO-Material (Collins et al., 2022) (224x224).
Dataset Splits: Yes. Evidence: The dataset was randomly partitioned into training (80%), validation (10%), and test (10%).
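A random 80/10/10 partition of this kind can be sketched with a seeded shuffle over sample indices (a minimal illustration; the seed and index-based interface are assumptions, not taken from the paper):

```python
import random

def split_indices(n_samples, seed=0):
    """Randomly partition sample indices into 80% train, 10% val, 10% test."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)  # fixed seed makes the split reproducible
    n_train = int(0.8 * n_samples)
    n_val = int(0.1 * n_samples)
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]
    return train, val, test

train, val, test = split_indices(1000)
print(len(train), len(val), len(test))  # -> 800 100 100
```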
Hardware Specification: Yes. Evidence: We trained each model for 200 epochs, which took less than 1 hour on a single V100 GPU. All the experiments herein were conducted with 4 V100 GPUs.
Software Dependencies: No. Evidence: Steerable networks were implemented based on the escnn library (Cesa et al., 2022). The same architectures were used as described in https://github.com/QUVA-Lab/escnn/blob/master/examples/model.ipynb for the Cn CNN and https://uvadlc-notebooks.readthedocs.io/en/latest/tutorial_notebooks/DL2/Geometric_deep_learning/tutorial2_steerable_cnns.html#SO(2)-equivariant-architecture for the SO(2) CNN. The hyperparameters of each baseline, such as the learning rate and lmax, were selected by Optuna (Akiba et al., 2019). Specific version numbers for software dependencies are not provided.
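The paper does not give Optuna configuration details, but the kind of search it automates can be illustrated without the library. Below is a dependency-free sketch of a log-uniform random search over the learning rate (the objective, bounds, and trial count are placeholders, not values from the paper):

```python
import math
import random

def sample_log_uniform(rng, low, high):
    """Draw a value log-uniformly from [low, high], as is typical for lr search."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

def random_search(objective, n_trials=50, low=1e-5, high=1e-2, seed=0):
    """Minimize objective(lr) over [low, high] by log-uniform random sampling."""
    rng = random.Random(seed)
    best_lr, best_val = None, float("inf")
    for _ in range(n_trials):
        lr = sample_log_uniform(rng, low, high)
        val = objective(lr)
        if val < best_val:
            best_lr, best_val = lr, val
    return best_lr, best_val

# Placeholder objective whose optimum is at lr = 1e-3
best_lr, best_val = random_search(lambda lr: (math.log10(lr) + 3.0) ** 2)
```

Optuna's samplers (e.g. TPE) are smarter than pure random search, but the trial loop and log-scale parameter ranges follow the same shape.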
Experiment Setup: Yes. Evidence: We used the AdamW optimizer (Loshchilov and Hutter, 2017) with β1 = 0.9 and β2 = 0.999, batch size 48, learning rate 10^-4, and weight decay 0.05.
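For reference, a single AdamW update with the stated hyperparameters can be written out in scalar form. This is a minimal sketch of the standard decoupled-weight-decay update rule, not the authors' training code:

```python
import math

def adamw_step(theta, grad, m, v, t,
               lr=1e-4, beta1=0.9, beta2=0.999, weight_decay=0.05, eps=1e-8):
    """One AdamW update for a single scalar parameter theta at step t >= 1.

    Weight decay is applied directly to the parameter, decoupled from the
    gradient-based step; this is what distinguishes AdamW from Adam + L2.
    """
    m = beta1 * m + (1 - beta1) * grad       # first-moment EMA
    v = beta2 * v + (1 - beta2) * grad ** 2  # second-moment EMA
    m_hat = m / (1 - beta1 ** t)             # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * weight_decay * theta            # decoupled decay
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)  # Adam step
    return theta, m, v

theta, m, v = adamw_step(theta=1.0, grad=0.5, m=0.0, v=0.0, t=1)
```

At t = 1 the bias-corrected moments reduce to m_hat = grad and v_hat = grad^2, so the gradient step is approximately lr * sign(grad), plus the lr * weight_decay * theta decay term.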