Amortized Eigendecomposition for Neural Networks
Authors: Tianbo Li, Zekun Shi, Jiaxi Zhao, Min Lin
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical studies on nuclear norm regularization, latent-space principal component analysis, and graph adversarial learning demonstrate significant improvements in training efficiency while producing nearly identical outcomes to conventional approaches. |
| Researcher Affiliation | Collaboration | Tianbo Li (1), Zekun Shi (1,2), Jiaxi Zhao (3), Min Lin (1); (1) SEA AI Lab, (2) School of Computing, National University of Singapore, (3) Department of Mathematics, National University of Singapore |
| Pseudocode | Yes | Algorithm 1 The conventional eigendecomposition in a neural network outlined in Eq. (2); Algorithm 2 The amortized eigendecomposition technique outlined in Eq. (12) |
| Open Source Code | No | Our code will be released upon acceptance. |
| Open Datasets | Yes | We measure the mean square error (MSE) of the eigenvalues as a function of the number of training iterations. The results are illustrated in Figure 3. ... training an auto-encoder on the MNIST dataset... Latent-space PCA method, as described in Eq. (4), using the MNIST dataset. ... on several citation networks, namely Cora, Citeseer, and Pubmed. |
| Dataset Splits | Yes | The nodes are partitioned randomly into training, validation, and test sets with respective proportions of 60%, 20%, and 20%. |
| Hardware Specification | Yes | All the experiments of our approach are conducted on a single NVIDIA A100 GPU with 40GB memory. |
| Software Dependencies | No | We implement our approach with the deep learning framework JAX [2]. The optimization algorithms are provided by the Optax and JAX-opt libraries [7]. No specific version numbers are provided for these software components. |
| Experiment Setup | Yes | We randomly generate ten symmetric matrices of size 1000 × 1000. ... by minimizing Brockett's cost function and convex trace loss (we adopt f(x) = x^1.5 as the convex function). To achieve this, we employ several optimization algorithms, including Adam [20], Adamax [20], Yogi [46], SGD, and L-BFGS [28]. ... The architectures of the encoder and the decoder are constructed as a 2-layer MLP with hidden layer dimensions of D = 128, 256, and 512. ... Our architecture consists of a three-layer GCN, ... Each layer has a hidden dimension of 32. The dropout rates are set to 0.4 for Cora and Citeseer, and to 0.1 for Pubmed, to prevent overfitting. For optimization, we employ the Adam algorithm with a learning rate of 10^-3. (A minimal sketch of the synthetic eigendecomposition setup appears below the table.) |
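To make the synthetic setup in the Experiment Setup row concrete, the sketch below recovers the smallest eigenpairs of a random symmetric 1000 × 1000 matrix by minimizing Brockett's cost function with Optax's Adam in JAX. This is an illustrative reconstruction, not the paper's Algorithm 2: the subspace dimension `k`, the learning rate, the iteration count, and the QR-based orthonormalization are assumptions made here for a self-contained example.

```python
# Minimal JAX/Optax sketch of eigendecomposition via Brockett's cost function.
# Assumptions (not from the paper): k = 10, Adam with lr = 1e-2, 2000 steps,
# and QR projection onto the Stiefel manifold.
import jax
import jax.numpy as jnp
import optax

n, k = 1000, 10  # matrix size follows the paper; k is chosen here for illustration

# Random symmetric test matrix, as in the synthetic experiment.
A = jax.random.normal(jax.random.PRNGKey(0), (n, n))
A = (A + A.T) / 2.0

# Brockett's cost: f(U) = tr(U^T A U N) with N = diag(k, k-1, ..., 1).
# Minimizing over orthonormal U aligns the columns of U with the eigenvectors
# of the k smallest eigenvalues of A.
N = jnp.diag(jnp.arange(k, 0, -1).astype(jnp.float32))

def orthonormalize(X):
    # Project an unconstrained parameter onto the Stiefel manifold via QR.
    Q, _ = jnp.linalg.qr(X)
    return Q

def brockett_loss(X):
    U = orthonormalize(X)
    return jnp.trace(U.T @ A @ U @ N)

params = jax.random.normal(jax.random.PRNGKey(1), (n, k))
opt = optax.adam(1e-2)
opt_state = opt.init(params)

@jax.jit
def step(params, opt_state):
    loss, grads = jax.value_and_grad(brockett_loss)(params)
    updates, opt_state = opt.update(grads, opt_state)
    params = optax.apply_updates(params, updates)
    return params, opt_state, loss

for _ in range(2000):
    params, opt_state, loss = step(params, opt_state)

# Compare the recovered eigenvalues against an exact eigendecomposition,
# mirroring the MSE-vs-iterations measurement reported in the paper.
U = orthonormalize(params)
approx_eigs = jnp.sort(jnp.diag(U.T @ A @ U))
exact_eigs = jnp.linalg.eigvalsh(A)[:k]  # ascending, so the k smallest
print("eigenvalue MSE:", jnp.mean((approx_eigs - exact_eigs) ** 2))
```

The appeal of casting the eigen-solve as an optimization problem, as we read the paper, is that inside a larger training loop it can be amortized across iterations rather than recomputed exactly at every step, which is where the reported efficiency gains come from.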