Neural Fourier Transform: A General Approach to Equivariant Representation Learning
Authors: Masanori Koyama, Kenji Fukumizu, Kohei Hayashi, Takeru Miyato
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also provide experimental results to demonstrate the application of NFT in typical scenarios with varying levels of knowledge about the acting group. (Section 5: Experiments) |
| Researcher Affiliation | Collaboration | 1Preferred Networks, Inc. 2The Institute of Statistical Mathematics 3University of Tübingen |
| Pseudocode | Yes | `from torch import nn; from einops.layers.torch import Rearrange; class EncoderAdapter(nn.Module): def __init__(self, num_patches, embed_dim, d_a, d_m): self.net = nn.Sequential(nn.Linear(embed_dim, embed_dim // 4), Rearrange('b n c -> b c n'), nn.Linear(num_patches, num_patches // 4), nn.GELU(), nn.LayerNorm([embed_dim // 4, num_patches // 4]), Rearrange('b c n -> b (c n)'), nn.Linear(embed_dim * num_patches // 16, d_a * d_m), Rearrange('b (m a) -> b m a', m=d_m)); def forward(self, encoder_output): return self.net(encoder_output)` |
| Open Source Code | Yes | Codes are in supplementary material, details are in Appendix D. |
| Open Datasets | Yes | U-NFT trained on CIFAR100 (Krizhevsky et al., 2009) sequences with T = 4. We applied g-NFT to the MNIST (LeCun et al., 1998) dataset with SO(2) rotation action and used it as a tool in unsupervised representation learning for OOD generalization: rotated Fashion-MNIST (Xiao et al., 2017) and rotated Kuzushiji-MNIST (Clanuwat et al., 2018) (two out-domains). We used three datasets: ModelNet10-SO3 (Liao et al., 2019) in 64×64 resolution, BRDFs (Greff et al., 2022) (224×224), and ABO-Material (Collins et al., 2022) (224×224). |
| Dataset Splits | Yes | The dataset was randomly partitioned into training (80%), validation (10%), and test (10%). |
| Hardware Specification | Yes | We trained each model for 200 epochs, which took less than 1 hour with a single V100 GPU. All the experiments herein were conducted with 4 V100 GPUs. |
| Software Dependencies | No | For steerable networks, we implemented them based on the escnn library (Cesa et al., 2022). We used the same architecture described in https://github.com/QUVA-Lab/escnn/blob/master/examples/model.ipynb for Cn CNN and https://uvadlc-notebooks.readthedocs.io/en/latest/tutorial_notebooks/DL2/Geometric_deep_learning/tutorial2_steerable_cnns.html#SO(2)-equivariant-architecture for SO(2) CNN. The hyperparameters of each baseline, such as the learning rate and lmax, were selected by Optuna (Akiba et al., 2019). Specific version numbers for software dependencies are not provided. |
| Experiment Setup | Yes | We used the AdamW optimizer (Loshchilov and Hutter, 2017) with β1 = 0.9 and β2 = 0.999. AdamW optimizer with batch size 48, learning rate 10⁻⁴, and weight decay 0.05. |
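The pseudocode excerpt above is a PDF extraction of the paper's einops-based encoder adapter, which maps a transformer encoder's patch tokens of shape `(batch, num_patches, embed_dim)` down to a `(batch, d_m, d_a)` latent matrix. Below is a hedged, pure-PyTorch reconstruction of that listing: the `Permute` helper replaces the paper's `Rearrange` calls (an assumption made to avoid the einops dependency), a missing `super().__init__()` is added, and all shapes/dimensions in the usage example are illustrative rather than taken from the paper.

```python
import torch
from torch import nn


class Permute(nn.Module):
    """Swap the patch and channel axes: (b, n, c) <-> (b, c, n).

    Stands in for einops' Rearrange('b n c -> b c n').
    """
    def forward(self, x):
        return x.transpose(1, 2)


class EncoderAdapter(nn.Module):
    """Reconstruction of the paper's adapter: compresses encoder output
    (b, num_patches, embed_dim) into a latent matrix (b, d_m, d_a)."""

    def __init__(self, num_patches, embed_dim, d_a, d_m):
        super().__init__()  # absent in the extracted listing; required by nn.Module
        self.d_a, self.d_m = d_a, d_m
        self.net = nn.Sequential(
            nn.Linear(embed_dim, embed_dim // 4),        # shrink channel dim
            Permute(),                                   # (b, n, c') -> (b, c', n)
            nn.Linear(num_patches, num_patches // 4),    # shrink patch dim
            nn.GELU(),
            nn.LayerNorm([embed_dim // 4, num_patches // 4]),
            nn.Flatten(start_dim=1),                     # (b, c'*n') as in 'b c n -> b (c n)'
            nn.Linear(embed_dim * num_patches // 16, d_a * d_m),
        )

    def forward(self, encoder_output):
        out = self.net(encoder_output)
        # final 'b (m a) -> b m a' rearrange, done with a plain view
        return out.view(-1, self.d_m, self.d_a)


# Illustrative usage with made-up dimensions (not from the paper):
adapter = EncoderAdapter(num_patches=16, embed_dim=32, d_a=8, d_m=4)
tokens = torch.randn(2, 16, 32)          # (batch, num_patches, embed_dim)
latent = adapter(tokens)
print(latent.shape)                      # torch.Size([2, 4, 8])
```

The two internal `Linear` layers act on different axes (channels, then patches), which is why the transpose sits between them; the final reshape exposes `d_m` "frequency" slots of width `d_a` for the NFT latent.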