Feature Grouping as a Stochastic Regularizer for High-Dimensional Structured Data

Authors: Sergul Aydore, Bertrand Thirion, Gael Varoquaux

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on two real-world datasets demonstrate that the proposed approach produces models that generalize better than those trained with conventional regularizers, improves convergence speed, and has linear computational cost.
Researcher Affiliation | Academia | Stevens Institute of Technology, New Jersey, USA; Inria Saclay, Palaiseau, France. Correspondence to: Sergul Aydore <sergulaydore@gmail.com>.
Pseudocode | Yes | Algorithm 1: Training of a Neural Network with Feature Grouping as a Stochastic Regularizer. (A minimal illustrative sketch of this mechanism appears after the table.)
Open Source Code | Yes | Our implementation is openly available at https://github.com/sergulaydore/Feature-Grouping-Regularizer.
Open Datasets | Yes | Olivetti Faces: The Olivetti dataset consists of grayscale 64×64 face images from 40 subjects (Hopper, 1992). HCP: The Human Connectome Project (HCP) has released a large, openly accessible fMRI dataset. Here we use task fMRI that includes seven tasks: 1. Working Memory, 2. Gambling, 3. Motor, 4. Language, 5. Social Cognition, 6. Relational Processing, and 7. Emotion Processing. These tasks have been chosen to map different brain systems. The dataset includes 500 different subjects with images registered to the standard MNI atlas. For a given subject and task, a GLM was fitted to each fMRI dataset (Barch et al., 2013). (A loading sketch for the Olivetti data appears after the table.)
Dataset Splits | No | The paper mentions using "validation loss" for early stopping ("We applied early stopping on MLP and CNN architectures when the validation loss stopped improving in 10 (also known as patience parameter) subsequent epochs.") but does not specify the size, percentage, or methodology for creating the validation set split. (A purely hypothetical split is shown in the sketch after the table.)
Hardware Specification | Yes | Experiments are run using an Nvidia GeForce GTX 1060 and 16GB RAM.
Software Dependencies | Yes | We use Python 3.6 for implementation (Oliphant, 2007) using the open-source libraries PyTorch (Paszke et al., 2017), scikit-learn (Pedregosa et al., 2011), NiBabel (Brett et al., 2016), nilearn (Abraham et al., 2014), joblib (Varoquaux & Grisel, 2009), and NumPy (Walt et al., 2011).
Experiment Setup | Yes | We use the standard SGD algorithm with a learning rate of 0.01 for the Olivetti dataset and 0.05 for the HCP dataset, with a cross-entropy loss. We ran the logistic-regression experiments long enough (200 epochs for Olivetti and 500 epochs for HCP and HCP-small) to guarantee convergence. We applied early stopping on the MLP and CNN architectures when the validation loss stopped improving for 10 subsequent epochs (the patience parameter). We repeated each experiment with 10 different random initializations. (A training-loop sketch reflecting this setup appears after the table.)
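
Sketch for the Pseudocode row: a minimal, illustrative PyTorch version of the core mechanism named by Algorithm 1, in which every SGD step replaces the minibatch features by their within-group averages under a freshly sampled grouping. This is not the authors' exact algorithm: the paper derives its grouping matrices from fast clustering of the structured features, whereas the random partition, group count, and toy dimensions below are assumptions made purely for illustration.

```python
import numpy as np
import torch
import torch.nn as nn

def random_grouping_matrix(n_features, n_groups, rng):
    """Build an (n_groups x n_features) matrix Phi with orthonormal rows:
    Phi[g, j] = 1/sqrt(|g|) if feature j falls in group g, else 0.
    Then x @ Phi.T @ Phi replaces every feature by its within-group mean."""
    # Random partition of features (assumption; the paper clusters structured features).
    labels = rng.integers(0, n_groups, size=n_features)
    phi = np.zeros((n_groups, n_features), dtype=np.float32)
    phi[labels, np.arange(n_features)] = 1.0
    sizes = phi.sum(axis=1, keepdims=True)
    sizes[sizes == 0] = 1.0                      # guard against empty groups
    return torch.from_numpy(phi / np.sqrt(sizes))

# Toy usage: a linear classifier trained on stochastically grouped inputs.
rng = np.random.default_rng(0)
n_features, n_groups, n_classes = 4096, 256, 40  # toy sizes (assumption)
model = nn.Linear(n_features, n_classes)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(32, n_features)                  # placeholder minibatch
y = torch.randint(0, n_classes, (32,))           # placeholder labels

for step in range(5):
    phi = random_grouping_matrix(n_features, n_groups, rng)  # fresh grouping every step
    x_grouped = x @ phi.T @ phi                  # piecewise-constant approximation of x
    optimizer.zero_grad()
    loss = loss_fn(model(x_grouped), y)
    loss.backward()
    optimizer.step()
```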
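
Sketch for the Open Datasets and Dataset Splits rows: the Olivetti faces can be fetched through scikit-learn's fetch_olivetti_faces. The 80/20 stratified validation split shown here is purely hypothetical, since the paper does not report its split sizes or methodology.

```python
from sklearn.datasets import fetch_olivetti_faces
from sklearn.model_selection import train_test_split

# 400 grayscale 64x64 face images of 40 subjects; downloaded on first call, then cached.
faces = fetch_olivetti_faces()
X, y = faces.data, faces.target        # X: (400, 4096) flattened pixels, y: subject ids 0..39

# Hypothetical 80/20 stratified split for the validation set used by early stopping;
# the actual split used in the paper is not reported.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
print(X_train.shape, X_val.shape)      # (320, 4096) (80, 4096)
```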
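
Sketch for the Experiment Setup row: a plain SGD training loop with cross-entropy loss and patience-based early stopping on the validation loss, matching the reported hyper-parameters (learning rate 0.01 for Olivetti, patience 10). The function name, epoch cap, and data-loader interface are assumptions, not the authors' code.

```python
import copy
import torch
import torch.nn as nn

def train_with_early_stopping(model, train_loader, val_loader,
                              lr=0.01, patience=10, max_epochs=200):
    """Standard SGD + cross-entropy with patience-based early stopping."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    best_val, best_state, epochs_without_improvement = float("inf"), None, 0

    for epoch in range(max_epochs):
        model.train()
        for xb, yb in train_loader:
            optimizer.zero_grad()
            loss_fn(model(xb), yb).backward()
            optimizer.step()

        # Validation loss drives early stopping.
        model.eval()
        with torch.no_grad():
            val_loss = sum(loss_fn(model(xb), yb).item() for xb, yb in val_loader)

        if val_loss < best_val:
            best_val, best_state = val_loss, copy.deepcopy(model.state_dict())
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # stop after `patience` epochs without improvement

    if best_state is not None:
        model.load_state_dict(best_state)
    return model
```

For the Olivetti-style setting, `model` could be the linear classifier from the first sketch (or an MLP/CNN), with `train_loader` and `val_loader` built as ordinary PyTorch DataLoaders over the hypothetical split above.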