Amortized Eigendecomposition for Neural Networks
Authors: Tianbo Li, Zekun Shi, Jiaxi Zhao, Min Lin
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical studies on nuclear norm regularization, latent-space principal component analysis, and graph adversarial learning demonstrate significant improvements in training efficiency while producing nearly identical outcomes to conventional approaches. |
| Researcher Affiliation | Collaboration | Tianbo Li (1), Zekun Shi (1,2), Jiaxi Zhao (3), Min Lin (1); (1) SEA AI Lab, (2) School of Computing, National University of Singapore, (3) Department of Mathematics, National University of Singapore |
| Pseudocode | Yes | Algorithm 1 The conventional eigendecomposition in a neural network outlined in Eq. (2); Algorithm 2 The amortized eigendecomposition technique outlined in Eq. (12) |
| Open Source Code | No | Our code will be released upon acceptance. |
| Open Datasets | Yes | We measure the mean square error (MSE) of the eigenvalues as a function of the number of training iterations. The results are illustrated in Figure 3. ... training an auto-encoder on the MNIST dataset... Latent-space PCA method, as described in Eq. (4), using the MNIST dataset. ... on several citation networks, namely Cora, Citeseer, and Pubmed. |
| Dataset Splits | Yes | The nodes are partitioned randomly into training, validation, and test sets with respective proportions of 60%, 20%, and 20%. |
| Hardware Specification | Yes | All the experiments of our approach are conducted on a single NVIDIA A100 GPU with 40GB memory. |
| Software Dependencies | No | We implement our approach with the deep learning framework JAX [2]. The optimization algorithms are provided by the Optax and JAX-opt libraries [7]. No specific version numbers are provided for these software components. |
| Experiment Setup | Yes | We randomly generate ten symmetric matrices of size 1000 × 1000. ... by minimizing Brockett's cost function and convex trace loss (we adopt f(x) = x^1.5 as the convex function). To achieve this, we employ several optimization algorithms, including Adam [20], Adamax [20], Yogi [46], SGD, and L-BFGS [28]. ... The architectures of the encoder and the decoder are constructed as a 2-layer MLP with hidden layer dimensions of D = 128, 256, and 512. ... Our architecture consists of a three-layer GCN, ... Each layer has a hidden dimension of 32. The dropout rates are set to 0.4 for Cora and Citeseer, and to 0.1 for Pubmed, to prevent overfitting. For optimization, we employ the Adam algorithm with a learning rate of 10^-3. (A minimal sketch of the synthetic eigendecomposition setup appears below the table.) |
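To make the synthetic setup in the Experiment Setup row concrete, the sketch below recovers the smallest eigenpairs of a random symmetric 1000 × 1000 matrix by minimizing Brockett's cost function with Optax's Adam in JAX. This is an illustrative reconstruction, not the paper's Algorithm 2: the subspace dimension `k`, the learning rate, the iteration count, and the QR-based orthonormalization are assumptions made here for a self-contained example.

```python
# Minimal JAX/Optax sketch of eigendecomposition via Brockett's cost function.
# Assumptions (not from the paper): k = 10, Adam with lr = 1e-2, 2000 steps,
# and QR projection onto the Stiefel manifold.
import jax
import jax.numpy as jnp
import optax

n, k = 1000, 10  # matrix size follows the paper; k is chosen here for illustration

# Random symmetric test matrix, as in the synthetic experiment.
A = jax.random.normal(jax.random.PRNGKey(0), (n, n))
A = (A + A.T) / 2.0

# Brockett's cost: f(U) = tr(U^T A U N) with N = diag(k, k-1, ..., 1).
# Minimizing over orthonormal U aligns the columns of U with the eigenvectors
# of the k smallest eigenvalues of A.
N = jnp.diag(jnp.arange(k, 0, -1).astype(jnp.float32))

def orthonormalize(X):
    # Project an unconstrained parameter onto the Stiefel manifold via QR.
    Q, _ = jnp.linalg.qr(X)
    return Q

def brockett_loss(X):
    U = orthonormalize(X)
    return jnp.trace(U.T @ A @ U @ N)

params = jax.random.normal(jax.random.PRNGKey(1), (n, k))
opt = optax.adam(1e-2)
opt_state = opt.init(params)

@jax.jit
def step(params, opt_state):
    loss, grads = jax.value_and_grad(brockett_loss)(params)
    updates, opt_state = opt.update(grads, opt_state)
    params = optax.apply_updates(params, updates)
    return params, opt_state, loss

for _ in range(2000):
    params, opt_state, loss = step(params, opt_state)

# Compare the recovered eigenvalues against an exact eigendecomposition,
# mirroring the MSE-vs-iterations measurement reported in the paper.
U = orthonormalize(params)
approx_eigs = jnp.sort(jnp.diag(U.T @ A @ U))
exact_eigs = jnp.linalg.eigvalsh(A)[:k]  # ascending, so the k smallest
print("eigenvalue MSE:", jnp.mean((approx_eigs - exact_eigs) ** 2))
```

The appeal of casting the eigen-solve as an optimization problem, as we read the paper, is that inside a larger training loop it can be amortized across iterations rather than recomputed exactly at every step, which is where the reported efficiency gains come from.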