Feature Grouping as a Stochastic Regularizer for High-Dimensional Structured Data

Authors: Sergul Aydore, Bertrand Thirion, Gael Varoquaux

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on two real-world datasets demonstrate that the proposed approach produces models that generalize better than those trained with conventional regularizers, improves convergence speed, and has linear computational cost.
Researcher Affiliation | Academia | Stevens Institute of Technology, New Jersey, USA; Inria Saclay, Palaiseau, France. Correspondence to: Sergul Aydore <sergulaydore@gmail.com>.
Pseudocode | Yes | Algorithm 1: Training of a Neural Network with Feature Grouping as a Stochastic Regularizer. (A minimal illustrative sketch of this mechanism appears after the table.)
Open Source Code | Yes | Our implementation is openly available at https://github.com/sergulaydore/Feature-Grouping-Regularizer.
Open Datasets | Yes | Olivetti Faces: The Olivetti dataset consists of grayscale 64×64 face images from 40 subjects (Hopper, 1992). HCP: The Human Connectome Project (HCP) has released a large, openly accessible fMRI dataset. Here we use task fMRI that includes seven tasks: 1. Working Memory, 2. Gambling, 3. Motor, 4. Language, 5. Social Cognition, 6. Relational Processing, and 7. Emotion Processing. These tasks have been chosen to map different brain systems. The dataset includes 500 different subjects with images registered to the standard MNI atlas. For a given subject and task, a GLM was fitted to each fMRI dataset (Barch et al., 2013). (A loading sketch for the Olivetti data appears after the table.)
Dataset Splits | No | The paper mentions using "validation loss" for early stopping ("We applied early stopping on MLP and CNN architectures when the validation loss stopped improving in 10 (also known as patience parameter) subsequent epochs.") but does not specify the size, percentage, or methodology for creating the validation set split. (A purely hypothetical split is shown in the sketch after the table.)
Hardware Specification | Yes | Experiments are run using an Nvidia GeForce GTX 1060 and 16GB RAM.
Software Dependencies | Yes | We use Python 3.6 for implementation (Oliphant, 2007) using the open-source libraries PyTorch (Paszke et al., 2017), scikit-learn (Pedregosa et al., 2011), NiBabel (Brett et al., 2016), nilearn (Abraham et al., 2014), joblib (Varoquaux & Grisel, 2009), and NumPy (Walt et al., 2011).
Experiment Setup | Yes | We use the standard SGD algorithm with a learning rate of 0.01 for the Olivetti dataset and 0.05 for the HCP dataset, with a cross-entropy loss. We ran the logistic-regression experiments long enough (200 epochs for Olivetti and 500 epochs for HCP and HCP-small) to guarantee convergence. We applied early stopping on the MLP and CNN architectures when the validation loss stopped improving for 10 subsequent epochs (the patience parameter). We repeated each experiment with 10 different random initializations. (A training-loop sketch reflecting this setup appears after the table.)
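
Sketch for the Pseudocode row: a minimal, illustrative PyTorch version of the core mechanism named by Algorithm 1, in which every SGD step replaces the minibatch features by their within-group averages under a freshly sampled grouping. This is not the authors' exact algorithm: the paper derives its grouping matrices from fast clustering of the structured features, whereas the random partition, group count, and toy dimensions below are assumptions made purely for illustration.

```python
import numpy as np
import torch
import torch.nn as nn

def random_grouping_matrix(n_features, n_groups, rng):
    """Build an (n_groups x n_features) matrix Phi with orthonormal rows:
    Phi[g, j] = 1/sqrt(|g|) if feature j falls in group g, else 0.
    Then x @ Phi.T @ Phi replaces every feature by its within-group mean."""
    # Random partition of features (assumption; the paper clusters structured features).
    labels = rng.integers(0, n_groups, size=n_features)
    phi = np.zeros((n_groups, n_features), dtype=np.float32)
    phi[labels, np.arange(n_features)] = 1.0
    sizes = phi.sum(axis=1, keepdims=True)
    sizes[sizes == 0] = 1.0                      # guard against empty groups
    return torch.from_numpy(phi / np.sqrt(sizes))

# Toy usage: a linear classifier trained on stochastically grouped inputs.
rng = np.random.default_rng(0)
n_features, n_groups, n_classes = 4096, 256, 40  # toy sizes (assumption)
model = nn.Linear(n_features, n_classes)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(32, n_features)                  # placeholder minibatch
y = torch.randint(0, n_classes, (32,))           # placeholder labels

for step in range(5):
    phi = random_grouping_matrix(n_features, n_groups, rng)  # fresh grouping every step
    x_grouped = x @ phi.T @ phi                  # piecewise-constant approximation of x
    optimizer.zero_grad()
    loss = loss_fn(model(x_grouped), y)
    loss.backward()
    optimizer.step()
```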
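
Sketch for the Open Datasets and Dataset Splits rows: the Olivetti faces can be fetched through scikit-learn's fetch_olivetti_faces. The 80/20 stratified validation split shown here is purely hypothetical, since the paper does not report its split sizes or methodology.

```python
from sklearn.datasets import fetch_olivetti_faces
from sklearn.model_selection import train_test_split

# 400 grayscale 64x64 face images of 40 subjects; downloaded on first call, then cached.
faces = fetch_olivetti_faces()
X, y = faces.data, faces.target        # X: (400, 4096) flattened pixels, y: subject ids 0..39

# Hypothetical 80/20 stratified split for the validation set used by early stopping;
# the actual split used in the paper is not reported.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
print(X_train.shape, X_val.shape)      # (320, 4096) (80, 4096)
```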
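
Sketch for the Experiment Setup row: a plain SGD training loop with cross-entropy loss and patience-based early stopping on the validation loss, matching the reported hyper-parameters (learning rate 0.01 for Olivetti, patience 10). The function name, epoch cap, and data-loader interface are assumptions, not the authors' code.

```python
import copy
import torch
import torch.nn as nn

def train_with_early_stopping(model, train_loader, val_loader,
                              lr=0.01, patience=10, max_epochs=200):
    """Standard SGD + cross-entropy with patience-based early stopping."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    best_val, best_state, epochs_without_improvement = float("inf"), None, 0

    for epoch in range(max_epochs):
        model.train()
        for xb, yb in train_loader:
            optimizer.zero_grad()
            loss_fn(model(xb), yb).backward()
            optimizer.step()

        # Validation loss drives early stopping.
        model.eval()
        with torch.no_grad():
            val_loss = sum(loss_fn(model(xb), yb).item() for xb, yb in val_loader)

        if val_loss < best_val:
            best_val, best_state = val_loss, copy.deepcopy(model.state_dict())
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # stop after `patience` epochs without improvement

    if best_state is not None:
        model.load_state_dict(best_state)
    return model
```

For the Olivetti-style setting, `model` could be the linear classifier from the first sketch (or an MLP/CNN), with `train_loader` and `val_loader` built as ordinary PyTorch DataLoaders over the hypothetical split above.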