Neural Characteristic Activation Analysis and Geometric Parameterization for ReLU Networks

Authors: Wenlin Chen, Hong Ge

NeurIPS 2024

Reproducibility Variable Result LLM Response
Research Type: Experimental. "This section contains empirical evaluation of GmP with neural network architectures of different sizes on both illustrative demonstrations and more challenging machine learning classification and regression benchmarks."
Researcher Affiliation: Academia. Wenlin Chen (University of Cambridge; MPI for Intelligent Systems; wc337@cam.ac.uk) and Hong Ge (University of Cambridge; hg344@cam.ac.uk).
Pseudocode: No. The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code: Yes. "Our code is available at https://github.com/Wenlin-Chen/geometric-parameterization."
Open Datasets: Yes. "We evaluate GmP on 7 regression problems from the UCI dataset [11]." "We evaluate GmP with a medium-sized convolutional neural network VGG-6 [58] on ImageNet32 [8]." "We evaluate GmP with a large residual neural network, ResNet-18 [22], on the full ImageNet (ILSVRC 2012) dataset [10]."
Dataset Splits: Yes. "We train an MLP with one hidden layer and 100 hidden units for 10 different random 80/20 train/test splits." "ImageNet (ILSVRC 2012) dataset [10], which consists of 1,281,167 training images and 50,000 validation images."
Hardware Specification: Yes. "All models are trained on a single NVIDIA GeForce RTX 2080 Ti." "All models are trained on a single NVIDIA A100 (80GB)."
Software Dependencies: No. The paper mentions optimizers such as Adam and SGD and implies a deep learning framework, but it does not provide version numbers for any software dependencies (e.g., PyTorch, Python, or CUDA versions).
Experiment Setup: Yes. "We use the Adam optimizer [28] with full-batch training." "We use cross-validation to select the learning rate for each compared method from the set {0.001, 0.003, 0.01, 0.03, 0.1, 0.3}." "We find that the optimal initial learning rate is 0.1 for GmP and 0.01 for all the other compared methods."
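The learning-rate selection procedure quoted above can be sketched as follows. This is a minimal illustration only: the paper trains MLPs with GmP, whereas here a toy linear-regression model, synthetic data, and step count are assumed for self-containedness; only the learning-rate grid and the full-batch Adam setup come from the quoted text.

```python
import numpy as np

# Learning-rate grid quoted from the paper's experiment setup.
LR_GRID = [0.001, 0.003, 0.01, 0.03, 0.1, 0.3]

def adam_fit(X, y, lr, steps=200):
    """Full-batch Adam on squared error for a toy linear model
    (illustrative stand-in for the paper's networks)."""
    w = np.zeros(X.shape[1])
    m, v = np.zeros_like(w), np.zeros_like(w)
    b1, b2, eps = 0.9, 0.999, 1e-8
    for t in range(1, steps + 1):
        g = 2.0 * X.T @ (X @ w - y) / len(y)  # full-batch gradient
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g ** 2
        mhat, vhat = m / (1 - b1 ** t), v / (1 - b2 ** t)
        w -= lr * mhat / (np.sqrt(vhat) + eps)
    return w

def cv_mse(X, y, lr, k=5):
    """Mean validation MSE over k folds for one learning rate."""
    idx = np.arange(len(y))
    errs = []
    for fold in np.array_split(idx, k):
        tr = np.setdiff1d(idx, fold)
        w = adam_fit(X[tr], y[tr], lr)
        errs.append(np.mean((X[fold] @ w - y[fold]) ** 2))
    return float(np.mean(errs))

# Synthetic regression data (assumption, not a paper dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)

# Pick the grid value with the lowest cross-validated error.
best_lr = min(LR_GRID, key=lambda lr: cv_mse(X, y, lr))
print(best_lr)
```

With a different model or dataset the selected rate will differ; the point is only the mechanics of grid search by cross-validation over the quoted set.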