Fiedler Regularization: Learning Neural Networks with Graph Sparsity
Authors: Edric Tam, David Dunson
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We performed experiments on several datasets comparing Fiedler regularization with traditional regularization methods such as Dropout and weight decay. Results demonstrate the efficacy of Fiedler regularization. |
| Researcher Affiliation | Academia | Department of Statistical Science, Duke University, Durham, NC, USA. |
| Pseudocode | Yes | Algorithm 1: Variational Fiedler Regularization with SGD (see the sketch below the table). |
| Open Source Code | Yes | The code used for the experiments can be found at the first author's GitHub repository (https://github.com/edrictam/Fiedler Regularization). |
| Open Datasets | Yes | MNIST is a standard handwriting recognition dataset; CIFAR10 is a benchmark object recognition dataset; the TCGA Pan-Cancer tumor classification dataset (Weinstein et al., 2013) is from the UCI Machine Learning Repository. |
| Dataset Splits | No | To select the dropping probability for Dropout, as well as the regularization hyperparameter for L1, Fiedler regularization and weight decay, we performed a very rough grid search on a small validation dataset. However, specific split details (percentages or counts) for this validation dataset are not provided. |
| Hardware Specification | Yes | All experiments were run on a Unix machine with an Intel Core i7 processor. |
| Software Dependencies | Yes | We used PyTorch 1.4 and Python 3.6 for all experiments. |
| Experiment Setup | Yes | For optimization, we adopted stochastic gradient descent with a momentum of 0.9 and a learning rate of 0.001. The Dropout probability is selected to be 0.5 for all layers, and the regularization hyperparameters for L1, Fiedler regularization and weight decay are 0.001, 0.01 and 0.01 respectively. All models in the experiments were trained under the cross-entropy loss. Hidden layers: 500 units (MNIST, CIFAR10), 50 units (TCGA). Batch size: 100 (MNIST, CIFAR10), 10 (TCGA). Epochs: 10 (MNIST, CIFAR10), 5 (TCGA). See the training sketch below the table. |
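
The pseudocode entry ("Variational Fiedler Regularization with SGD") refers to penalizing the Fiedler value, i.e. the second-smallest eigenvalue of the graph Laplacian of the network viewed as a weighted graph, which encourages graph sparsity. The sketch below computes that penalty directly via an eigendecomposition for a small multilayer perceptron. It is a minimal illustration under stated assumptions, not the authors' implementation: the helper name `fiedler_penalty` is hypothetical, and the paper's variational formulation avoids the full per-step eigendecomposition that this sketch performs.

```python
# Minimal sketch (not the authors' code): Fiedler value of the weighted graph
# induced by a stack of fully connected layers. Assumes PyTorch >= 1.8 for
# torch.linalg.eigvalsh; `fiedler_penalty` is a hypothetical helper name.
import torch
import torch.nn as nn

def fiedler_penalty(linear_layers):
    """Second-smallest eigenvalue of the Laplacian of the graph whose nodes are
    neurons and whose edge weights are the absolute weights between consecutive
    layers. A small value means the graph is nearly disconnected, i.e. sparse."""
    sizes = [linear_layers[0].in_features] + [l.out_features for l in linear_layers]
    n = sum(sizes)
    A = torch.zeros(n, n, device=linear_layers[0].weight.device)
    offset = 0
    for i, layer in enumerate(linear_layers):
        nxt = offset + sizes[i]                      # first node of the next layer
        W = layer.weight.abs()                       # shape (out_features, in_features)
        A[nxt:nxt + sizes[i + 1], offset:offset + sizes[i]] = W
        A[offset:offset + sizes[i], nxt:nxt + sizes[i + 1]] = W.t()
        offset = nxt
    L = torch.diag(A.sum(dim=1)) - A                 # unnormalized graph Laplacian
    return torch.linalg.eigvalsh(L)[1]               # eigenvalues come back ascending
```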
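
The Experiment Setup row can likewise be read as a concrete training configuration. The loop below wires the penalty above into the reported MNIST settings (500 hidden units, batch size 100, SGD with momentum 0.9, learning rate 0.001, cross-entropy loss, Fiedler hyperparameter 0.01, 10 epochs). It is a plausible reconstruction under those assumptions rather than the authors' script, and `loader` stands in for a standard MNIST DataLoader.

```python
# Sketch of a training loop using the settings reported in the table above.
# `loader` is assumed to yield (image, label) MNIST batches of size 100.
model = nn.Sequential(nn.Linear(784, 500), nn.ReLU(), nn.Linear(500, 10))
linear_layers = [m for m in model if isinstance(m, nn.Linear)]

optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
criterion = nn.CrossEntropyLoss()
lam = 0.01  # regularization hyperparameter reported for Fiedler regularization

for epoch in range(10):                              # 10 epochs reported for MNIST
    for x, y in loader:
        optimizer.zero_grad()
        logits = model(x.view(x.size(0), -1))        # flatten 28x28 images
        loss = criterion(logits, y) + lam * fiedler_penalty(linear_layers)
        loss.backward()
        optimizer.step()
```

Recomputing a full eigendecomposition of the roughly 1300-node Laplacian at every batch is expensive, which is presumably why the paper's Algorithm 1 works with a variational characterization of the Fiedler value instead of the direct computation sketched here.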