M-FAC: Efficient Matrix-Free Approximations of Second-Order Information

Authors: Elias Frantar, Eldar Kurtic, Dan Alistarh

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 4 (Experimental Validation): For pruning, our implementation provides order-of-magnitude improvements over the block-wise approximation of [37] for classic benchmarks such as pruning ResNet-50 and MobileNet on the ImageNet dataset. ... What is more, our preconditioned SGD (even without momentum) can be competitive in terms of validation accuracy with state-of-the-art optimizers on models of moderate size, including compact vision architectures and Transformer language models [42]. Its computational overheads are of 5%-55% relative to vanilla SGD on standard CNN architectures.
Researcher Affiliation | Collaboration | Elias Frantar (IST Austria, elias.frantar@ist.ac.at), Eldar Kurtic (IST Austria, eldar.kurtic@ist.ac.at), Dan Alistarh (IST Austria & Neural Magic, dan.alistarh@ist.ac.at)
Pseudocode | No | The paper describes the algorithms in prose and equations but does not include a formally labeled pseudocode or algorithm block.
Open Source Code | Yes | Implementations are available at [9] and [17].
Open Datasets | Yes | We prune CNNs (ResNet-50 [15] and MobileNet-V1 [16]) on the ImageNet dataset [36].
Dataset Splits | Yes | We prune CNNs (ResNet-50 [15] and MobileNet-V1 [16]) on the ImageNet dataset [36].
Hardware Specification | Yes | Timing experiments are run on a machine with NVIDIA RTX 2080 Ti GPUs, a 48-core Intel CPU, and 512 GB of RAM.
Software Dependencies | No | PyTorch [34] implementations of a pruning and optimization library. TensorFlow [1] is also mentioned, but specific version numbers for these software dependencies are not provided in the main text.
Experiment Setup | Yes | Following [37], we used batched gradients (of size 16) as single samples inside the Fisher approximation.
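
The experiment-setup row above refers to building the empirical Fisher from gradient samples, and the matrix-free inverse-Fisher-vector products at the core of M-FAC rest on folding rank-one gradient outer products into the inverse one at a time. The NumPy snippet below is a minimal illustrative sketch of that idea for the static case, assuming the gradients (e.g., the size-16 batched gradients used as single samples) are stacked row-wise into a matrix G; the function name empirical_fisher_ihvp and the parameters G, x, and lam are our own illustrative choices, and the paper's actual algorithm adds precomputation, blocking, and a dynamic variant not shown here.

```python
import numpy as np

def empirical_fisher_ihvp(G, x, lam=1e-4):
    """Apply the inverse of F = lam*I + (1/m) * sum_i g_i g_i^T to a vector x
    without forming the d x d matrix, via m successive Sherman-Morrison
    rank-one updates. G has shape (m, d): one (batched) gradient per row."""
    m, _ = G.shape
    # P[i] ends up holding F_{i-1}^{-1} g_i as updates are folded in one by one.
    P = G / lam                      # F_0^{-1} g_j = g_j / lam for every row j
    denoms = np.empty(m)
    for i in range(m):
        denoms[i] = m + G[i] @ P[i]
        # Fold update i into the still-unprocessed rows j > i:
        # F_i^{-1} g_j = F_{i-1}^{-1} g_j - (g_i^T F_{i-1}^{-1} g_j / denom_i) * P[i]
        coeffs = (P[i + 1:] @ G[i]) / denoms[i]
        P[i + 1:] -= np.outer(coeffs, P[i])
    # Replay the same recursion on the query vector x.
    v = x / lam
    for i in range(m):
        v -= ((G[i] @ v) / denoms[i]) * P[i]
    return v
```

For small dimensions the result can be checked against a dense solve, np.linalg.solve(lam * np.eye(d) + G.T @ G / m, x); a preconditioned SGD step in the spirit of the quoted claim would then use such a routine to precondition the current gradient before the parameter update, e.g. w -= lr * empirical_fisher_ihvp(G, grad).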