Learning Neural Networks with Adaptive Regularization
Authors: Han Zhao, Yao-Hung Hubert Tsai, Russ R. Salakhutdinov, Geoffrey J. Gordon
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we demonstrate that the proposed method helps networks converge to local optima with smaller stable ranks and spectral norms. These properties suggest better generalization, and we present empirical results to support this expectation. We also verify the effectiveness of the approach on multiclass classification and multitask regression problems with various network structures. |
| Researcher Affiliation | Collaboration | Han Zhao, Yao-Hung Hubert Tsai, Ruslan Salakhutdinov, Geoffrey J. Gordon; Carnegie Mellon University, Microsoft Research Montreal. {han.zhao,yaohungt,rsalakhu}@cs.cmu.edu; geoff.gordon@microsoft.com |
| Pseudocode | Yes | Algorithm 1 Block Coordinate Descent for Adaptive Regularization (a hedged sketch of the alternating structure follows this table) |
| Open Source Code | Yes | Our code is publicly available at: https://github.com/yaohungt/Adaptive-Regularization-Neural-Network |
| Open Datasets | Yes | Multiclass Classification (MNIST & CIFAR10): In this experiment, we show that AdaReg provides an effective regularization on the network parameters. Multitask Regression (SARCOS): SARCOS relates to an inverse dynamics problem for a seven degree-of-freedom (DOF) SARCOS anthropomorphic robot arm [41]. |
| Dataset Splits | No | The paper explicitly mentions 'training set' and 'test set' sizes and usage for SARCOS, MNIST, and CIFAR10, but it does not provide explicit details about a separate validation set split or its methodology. |
| Hardware Specification | No | The acknowledgments section mentions 'Nvidia GPU grant' and 'NVIDIA’s GPU support,' but it does not specify the exact models or configurations of the hardware (e.g., specific GPU series, CPU type, memory) used for the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python version, specific deep learning framework versions like PyTorch or TensorFlow). |
| Experiment Setup | Yes | We also note that we fix all the hyperparameters, such as the learning rate, to be the same for all the methods. We study two minibatch settings, of size 256 and 2048, respectively. In this experiment, we fix the number of outer loops to 2/5, and each block optimization over the network weights contains 50 epochs. (An illustrative setup sketch follows the table.) |
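
The pseudocode row names Algorithm 1, a block coordinate descent procedure that alternates between optimizing the network weights and updating the adaptive regularizer. The paper's exact updates are in the linked repository; the following is only a minimal PyTorch sketch of that alternating structure, assuming a quadratic penalty tr(WᵀΩW) with a learned matrix Omega. The names `adaptive_penalty` and `bcd_train`, the closed-form Omega heuristic, and the `model.fc` layer are hypothetical illustrations, not the authors' API.

```python
import torch
import torch.nn as nn

def adaptive_penalty(weight, omega):
    """Quadratic penalty tr(W^T Omega W); an assumed stand-in for the
    paper's adaptive regularizer (the real update lives in the repo)."""
    return torch.trace(weight.t() @ omega @ weight)

def bcd_train(model, loader, omega, outer_loops=2, inner_epochs=50, lr=0.01):
    """Alternate between two blocks: (1) SGD on the network weights with
    Omega held fixed, (2) a refit of Omega with the weights held fixed.
    Defaults mirror the reported setup (2/5 outer loops, 50 epochs per
    weight block)."""
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(outer_loops):
        # Block 1: optimize network weights, Omega fixed.
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        for _ in range(inner_epochs):
            for x, y in loader:
                opt.zero_grad()
                loss = loss_fn(model(x), y)
                # Regularize the last layer's weights (choice is illustrative).
                loss = loss + adaptive_penalty(model.fc.weight, omega)
                loss.backward()
                opt.step()
        # Block 2: update Omega with the weights fixed. A closed-form
        # inverse-covariance heuristic stands in for the paper's update.
        W = model.fc.weight.detach()
        cov = (W @ W.t()) / W.shape[1]
        omega = torch.linalg.inv(cov + 1e-3 * torch.eye(cov.shape[0]))
    return model, omega
```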
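
For the experiment-setup row, a small sketch of how the two reported minibatch settings (256 and 2048) might be wired up on MNIST with torchvision; the dataset path and the commented-out training call are placeholders, not the authors' script.

```python
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Two minibatch settings, as reported: 256 and 2048. All other
# hyperparameters (e.g. learning rate) stay identical across methods.
train_set = datasets.MNIST("data", train=True, download=True,
                           transform=transforms.ToTensor())
for batch_size in (256, 2048):
    loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    # model, omega = ...  # build a fresh model and initial Omega here
    # bcd_train(model, loader, omega)  # hypothetical call, see sketch above
```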