No Free Prune: Information-Theoretic Barriers to Pruning at Initialization

Authors: Tanishq Kumar, Kevin Luo, Mark Sellke

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on neural networks confirm that information gained during training may indeed affect model capacity. |
| Researcher Affiliation | Academia | Harvard University. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states 'We used the pruning code of (Tanaka et al., 2020)', which refers to third-party code, but does not provide access to source code developed for this paper's own methodology. |
| Open Datasets | Yes | 'We consider a two-hidden layer network with ReLU activation, on a train set of points (Gaussian data in Figure 1(a) and FashionMNIST in Figure 1(b)), as well as a ConvNet on noisy CIFAR-10 in 1(c)...' |
| Dataset Splits | No | The paper mentions using a 'train set' but does not specify explicit training, validation, or test splits, nor does it reference standard splits that define these proportions. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU or GPU models, or memory) used to run its experiments. |
| Software Dependencies | No | The paper mentions using Adam as an optimizer and a ReLU activation, and refers to 'the pruning code of (Tanaka et al., 2020)', but it does not list software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | 'We train till convergence in loss to within 0.01 (or until accuracy doesn't change for three consecutive epochs), with η = 1e-3 on Adam. We use a batch size of 64 with a two-hidden layer ReLU architecture with a hidden width of 200.' A hedged sketch of this setup appears below. |
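For concreteness, here is a minimal PyTorch sketch of the training setup quoted above: a two-hidden-layer ReLU network of hidden width 200, trained with Adam at η = 1e-3 and batch size 64 until the loss converges to within 0.01 or accuracy is flat for three consecutive epochs. The dataset choice (FashionMNIST, as in Figure 1(b)), the cross-entropy loss, the epoch cap, and the exact convergence test are assumptions, not the authors' code; the paper itself builds on the pruning code of (Tanaka et al., 2020).

```python
# Hedged sketch of the reported setup; dataset, loss, epoch cap, and the
# precise convergence test are assumptions, not taken from the paper's code.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

# FashionMNIST as in Figure 1(b) of the paper (assumed preprocessing).
train_set = datasets.FashionMNIST(
    root="./data", train=True, download=True, transform=transforms.ToTensor()
)
loader = DataLoader(train_set, batch_size=64, shuffle=True)

# Two-hidden-layer ReLU network with hidden width 200.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 200), nn.ReLU(),
    nn.Linear(200, 200), nn.ReLU(),
    nn.Linear(200, 10),
).to(device)

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

prev_acc, flat_epochs = None, 0
for epoch in range(200):  # epoch cap is an assumption
    total_loss, correct, n = 0.0, 0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        opt.zero_grad()
        logits = model(x)
        loss = criterion(logits, y)
        loss.backward()
        opt.step()
        total_loss += loss.item() * y.size(0)
        correct += (logits.argmax(1) == y).sum().item()
        n += y.size(0)
    mean_loss, acc = total_loss / n, correct / n
    # Stop when loss converges to within 0.01, or when accuracy is
    # unchanged (here: within 1e-4) for three consecutive epochs.
    if mean_loss < 0.01:
        break
    flat_epochs = flat_epochs + 1 if prev_acc is not None and abs(acc - prev_acc) < 1e-4 else 0
    prev_acc = acc
    if flat_epochs >= 3:
        break
```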