Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks

Authors: Tolga Birdal, Aaron Lou, Leonidas J. Guibas, Umut Simsekli

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that the proposed approach can efficiently compute a network's intrinsic dimension in a variety of settings, which is predictive of the generalization error. This section presents our experimental results in two parts: (i) analyzing and quantifying generalization in practical deep networks on real data, (ii) ablation studies on a random diffusion process.
Researcher Affiliation | Academia | Tolga Birdal (Stanford University, tbirdal@stanford.edu), Aaron Lou (Stanford University, aaronlou@stanford.edu), Leonidas Guibas (Stanford University, guibas@cs.stanford.edu), Umut Şimşekli (INRIA & ENS, PSL Research University, umut.simsekli@inria.fr).
Pseudocode | Yes | Algorithm 1: Computation of dim_PH (a Python sketch of this estimator is given after this table).
    input:  the set of iterates W = {w_i}_{i=1}^K, smallest sample size n_min, a skip step Δ, and α
    output: dim_PH(W)
    n ← n_min, E ← []
    while n ≤ K do
        W_n ← sample(W, n)                                      // random sampling
        VR_n ← VR(W_n)                                          // Vietoris-Rips filtration
        E[i] ← E_α(W_n) = Σ_{γ ∈ PH^0(VR_n)} |I(γ)|^α           // compute lifetime sums from PH
        n ← n + Δ
    m, b ← fitline(log(n_min : Δ : K), log(E))                  // fit a power law to the E_α values
    dim_PH(W) ← α / (1 − m)
Open Source Code | Yes | To foster further developments at the intersection of persistent homology and statistical learning theory, we release our source code under: https://github.com/tolgabirdal/PHDim_Generalization.
Open Datasets | Yes | In particular, we train AlexNet [KSH12], a 5-layer (fcn-5) and a 7-layer (fcn-7) fully connected network, and a 9-layer convolutional network (cnn-9) on MNIST, CIFAR10 and CIFAR100 datasets for multiple batch sizes and learning rates until convergence.
Dataset Splits | No | The paper states that it trains models until convergence and measures generalization as the gap between training and test accuracies, but it does not explicitly detail the training, validation, and test dataset splits.
Hardware Specification | No | The paper mentions using PyTorch for the implementation but does not provide specific details about the hardware (e.g., GPU model, CPU, memory) used to run the experiments.
Software Dependencies | No | The paper mentions using PyTorch [PGM+19] and the associated persistent homology package torchph [CHU17, CHN19] but does not specify their version numbers.
Experiment Setup | Yes | In particular, we train AlexNet [KSH12], a 5-layer (fcn-5) and a 7-layer (fcn-7) fully connected network, and a 9-layer convolutional network (cnn-9) on MNIST, CIFAR10 and CIFAR100 datasets for multiple batch sizes and learning rates until convergence. For AlexNet, we consider 1000 iterates prior to convergence and, for the others, we only consider 200. ... We train a LeNet-5 network [LBBH98] on CIFAR10 [Kri09] and compare a clean training with a training that uses our topological regularizer with λ set to 1. We train for 200 epochs with a batch size of 128 and report the train and test accuracies in Fig. 5 over a variety of learning rates. (A sketch of how such weight iterates might be collected is given after this table.)
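
Below is a minimal Python sketch of the dim_PH estimator in Algorithm 1. It is not the authors' released implementation (see the repository linked above); it assumes the standard fact that the 0-dimensional persistent-homology lifetimes of a Vietoris-Rips filtration coincide with the edge lengths of the Euclidean minimum spanning tree, so the lifetime sums E_α can be computed with SciPy rather than a dedicated PH package. The function names lifetime_sum and ph_dim and the default parameter values are illustrative.

# Minimal sketch of Algorithm 1 (dim_PH estimation); not the authors' released code.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

def lifetime_sum(points, alpha=1.0):
    """E_alpha(points): sum of alpha-powered 0-dim PH lifetimes (= MST edge lengths)."""
    dists = squareform(pdist(points))            # pairwise Euclidean distances
    mst = minimum_spanning_tree(dists)           # single-linkage merge heights
    return float(np.sum(mst.data ** alpha))

def ph_dim(W, n_min=100, step=50, alpha=1.0, seed=0):
    """Estimate dim_PH of the iterate set W (shape K x d) via a log-log line fit.
    n_min, step and alpha are illustrative defaults, not values from the paper."""
    rng = np.random.default_rng(seed)
    K = W.shape[0]
    ns, Es = [], []
    for n in range(n_min, K + 1, step):          # n = n_min, n_min + step, ..., <= K
        idx = rng.choice(K, size=n, replace=False)   # random subsample of iterates
        ns.append(n)
        Es.append(lifetime_sum(W[idx], alpha))
    m, _ = np.polyfit(np.log(ns), np.log(Es), 1)     # slope of the power law
    return alpha / (1.0 - m)                     # dim_PH = alpha / (1 - slope)

Here m is the slope of the fitted line of log E_α against log n, so the returned value corresponds to the last line of Algorithm 1, dim_PH = α / (1 − m).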
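
A hypothetical PyTorch sketch of how the weight iterates mentioned in the experiment setup (e.g., the last 200 iterates prior to convergence) might be collected and handed to the estimator above. The model, data loader, loss function, and hyperparameters are placeholders, not the paper's training script.

# Hypothetical collection of weight iterates from an (already converged) model.
import numpy as np
import torch

def collect_iterates(model, loader, loss_fn, lr=0.01, K=200, device="cpu"):
    """Continue SGD on a trained model and return the last K flattened
    parameter vectors as a (K, d) NumPy array for ph_dim above."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    iterates = []
    model.train()
    while len(iterates) < K:                     # keep stepping until K iterates are stored
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
            with torch.no_grad():
                w = torch.cat([p.detach().flatten() for p in model.parameters()])
            iterates.append(w.cpu().numpy())
            if len(iterates) == K:
                break
    return np.stack(iterates)

# Hypothetical usage with a LeNet-5-style model and a CIFAR10 loader:
# W = collect_iterates(lenet5, train_loader, torch.nn.CrossEntropyLoss())
# print(ph_dim(W, n_min=50, step=10))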