Robust low-rank training via approximate orthonormal constraints

Authors: Dayana Savostianova, Emanuele Zangrando, Gianluca Ceruti, Francesco Tudisco

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | This is shown by extensive numerical evidence and by our main approximation theorem that shows the computed robust low-rank network well-approximates the ideal full model, provided a highly performing low-rank sub-network exists. ... We provide several experimental evaluations on different architectures and datasets, where the robust low-rank networks are compared against a variety of baselines.
Researcher Affiliation | Academia | Dayana Savostianova, Gran Sasso Science Institute, 67100 L'Aquila (Italy), dayana.savostianova@gssi.it; Emanuele Zangrando, Gran Sasso Science Institute, 67100 L'Aquila (Italy), emanuele.zangrando@gssi.it; Gianluca Ceruti, University of Innsbruck, 6020 Innsbruck (Austria), gianluca.ceruti@uibk.ac.at; Francesco Tudisco, Gran Sasso Science Institute, 67100 L'Aquila (Italy), francesco.tudisco@gssi.it
Pseudocode | Yes | Algorithm 1: Pseudocode of robust well-Conditioned Low-Rank (CondLR) training scheme. (An illustrative low-rank layer sketch, not the paper's Algorithm 1, appears after this table.)
Open Source Code | Yes | All the experiments can be reproduced with the code in PyTorch available at https://github.com/COMPiLELab/CondLR.
Open Datasets | Yes | We consider MNIST, CIFAR10, and CIFAR100 [33] datasets for evaluation purposes. ... [33] A. Krizhevsky, G. Hinton, et al. Learning multiple layers of features from tiny images. 2009.
Dataset Splits | No | The paper mentions "60,000 training images" and "10,000 test images" for MNIST, and similar counts for CIFAR10/100, but does not specify any explicit validation dataset splits (e.g., percentages or exact counts for a validation set). (A data-loading sketch using the standard train/test splits appears after this table.)
Hardware Specification | No | The paper does not specify the exact hardware (e.g., CPU or GPU models, cloud instances) used for running the experiments.
Software Dependencies | No | The paper mentions PyTorch as the framework used for the code, but does not provide a specific version number for PyTorch or any other software dependencies.
Experiment Setup | Yes | Each method and model was trained for 120 epochs of stochastic gradient descent with a minibatch size of 128. We used a learning rate of 0.1 for LeNet5 and 0.05 for VGG16 with momentum 0.3 and 0.45, respectively, and a learning rate scheduler with factor = 0.3 at 70 and 100 epochs. (A sketch of this training configuration appears after this table.)
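
Note on the pseudocode row: the paper's Algorithm 1 (CondLR) is not reproduced in this report. Purely as an illustration of the idea named in the title, the sketch below trains a low-rank factored layer W = U S V^T while softly penalizing deviation of U and V from orthonormal columns. The class LowRankLinear, the penalty form, and the penalty weight are hypothetical; this is not the authors' CondLR scheme.

# Illustrative only: a low-rank linear layer W = U S V^T with a soft
# orthonormality penalty on U and V. This is NOT the paper's Algorithm 1
# (CondLR); LowRankLinear and the penalty weight are hypothetical.
import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int, rank: int):
        super().__init__()
        self.U = nn.Parameter(torch.randn(out_features, rank) / rank**0.5)
        self.S = nn.Parameter(torch.eye(rank))
        self.V = nn.Parameter(torch.randn(in_features, rank) / rank**0.5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = x W^T with W = U S V^T, i.e. y = x V S^T U^T
        return x @ self.V @ self.S.T @ self.U.T

    def ortho_penalty(self) -> torch.Tensor:
        # Deviation of U and V from having orthonormal columns.
        I = torch.eye(self.S.shape[0], device=self.S.device)
        return ((self.U.T @ self.U - I).pow(2).sum()
                + (self.V.T @ self.V - I).pow(2).sum())

# Usage: add the penalty to the task loss with a (hypothetical) weight.
layer = LowRankLinear(784, 256, rank=32)
x = torch.randn(128, 784)
loss = layer(x).pow(2).mean() + 1e-2 * layer.ortho_penalty()
loss.backward()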
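
Note on the dataset rows: MNIST, CIFAR10, and CIFAR100 are all available through torchvision, so the standard train/test splits (60,000/10,000 for MNIST; 50,000/10,000 for CIFAR10 and CIFAR100) can be obtained as in the minimal sketch below. The 90/10 validation carve-out at the end is purely illustrative, since the paper does not specify a validation split.

# Minimal sketch: standard torchvision train/test splits for the datasets
# used in the paper. The final 90/10 validation split is illustrative only;
# the paper does not specify a validation split.
import torch
from torchvision import datasets, transforms

transform = transforms.ToTensor()

mnist_train = datasets.MNIST("data", train=True, download=True, transform=transform)      # 60,000 images
mnist_test = datasets.MNIST("data", train=False, download=True, transform=transform)      # 10,000 images
cifar10_train = datasets.CIFAR10("data", train=True, download=True, transform=transform)  # 50,000 images
cifar10_test = datasets.CIFAR10("data", train=False, download=True, transform=transform)  # 10,000 images
cifar100_train = datasets.CIFAR100("data", train=True, download=True, transform=transform)
cifar100_test = datasets.CIFAR100("data", train=False, download=True, transform=transform)

# Hypothetical 90/10 train/validation split (not specified in the paper).
n_val = len(mnist_train) // 10
train_subset, val_subset = torch.utils.data.random_split(
    mnist_train, [len(mnist_train) - n_val, n_val]
)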
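
Note on the experiment setup row: based only on the hyperparameters quoted above, a rough PyTorch training-loop sketch could look as follows. The stand-in model, the interpretation of "factor = 0.3 at 70 and 100 epochs" as MultiStepLR milestones, and the data pipeline are assumptions, not details taken from the paper.

# Sketch of the reported configuration: SGD, 120 epochs, batch size 128,
# lr 0.1 (LeNet5) or 0.05 (VGG16), momentum 0.3 / 0.45, and the learning
# rate scaled by 0.3 at epochs 70 and 100 (assumed to mean MultiStepLR).
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Stand-in model; the paper trains LeNet5 and VGG16, not this tiny network.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

train_set = datasets.MNIST("data", train=True, download=True,
                           transform=transforms.ToTensor())
train_loader = DataLoader(train_set, batch_size=128, shuffle=True)

# LeNet5 settings from the quote; VGG16 would use lr=0.05, momentum=0.45.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.3)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[70, 100], gamma=0.3)

criterion = nn.CrossEntropyLoss()
for epoch in range(120):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()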