Reweighted Solutions for Weighted Low Rank Approximation

Authors: David Woodruff, Taisuke Yasuda

ICML 2024

Reproducibility assessment: each variable below lists the extracted result followed by the supporting LLM response.
Research Type: Experimental
"We demonstrate the empirical performance of our WLRA algorithms through experiments for model compression tasks. We first show in Section 5.1 that the importance matrices arising in this application are indeed very low rank. We may interpret this phenomenon intuitively: we hypothesize that the importance score of some parameter A_{i,j} is essentially the product of the importance of the corresponding input i and the importance of the corresponding output j. This observation may be of independent interest, and also empirically justifies the low rank weight matrix assumption that we make in this work, as well as the works of Razenshteyn et al. (2016) and Ban et al. (2019). Next, in Section 5.2, we conduct experiments which demonstrate the superiority of our methods in practice."
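To illustrate the product hypothesis (an importance matrix with entries W_{i,j} ≈ u_i · v_j is rank one up to noise), here is a minimal NumPy sketch on synthetic data; the matrix below is a hypothetical stand-in, not the paper's actual importance scores.

```python
import numpy as np

# Hypothetical importance matrix with product structure
# W[i, j] ~ u[i] * v[j] plus small noise (a stand-in for real scores).
rng = np.random.default_rng(0)
u, v = rng.random(100), rng.random(50)
W = np.outer(u, v) + 0.01 * rng.standard_normal((100, 50))

# If the product hypothesis holds, the top singular value dominates,
# i.e., nearly all spectral mass sits in a rank-1 approximation.
s = np.linalg.svd(W, compute_uv=False)
print("fraction of spectral mass in rank 1:", s[0] ** 2 / np.sum(s ** 2))
```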
Researcher Affiliation: Academia
"School of Computer Science, Carnegie Mellon University, Pittsburgh, PA."
Pseudocode: Yes
Algorithm 1 (Weighted low rank approximation)
Input: matrix A ∈ R^{n×d}, non-negative weights W ∈ R^{n×d} with rank r, rank parameter k.
Output: approximate solution Â.
1: Compute a rank-rk approximation Â_W of the Hadamard product W ∘ A.
2: Return Â := W^{∘−1} ∘ Â_W, where W^{∘−1} denotes the entrywise inverse of W.
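A minimal NumPy sketch of Algorithm 1, assuming strictly positive weights (so the entrywise inverse is defined) and using a truncated SVD for the rank-rk approximation; function and variable names are illustrative.

```python
import numpy as np

def reweighted_wlra(A, W, k, r):
    """Sketch of Algorithm 1: approximate A under entrywise weights W.

    Assumes every entry of W is positive so the entrywise inverse is
    well defined, and that r * k <= min(A.shape).
    """
    # Step 1: rank-(r*k) truncated SVD of the Hadamard product W o A.
    U, s, Vt = np.linalg.svd(W * A, full_matrices=False)
    rk = r * k
    A_W = (U[:, :rk] * s[:rk]) @ Vt[:rk, :]
    # Step 2: undo the weighting entrywise.
    return A_W / W
```

Since W ∘ Â = Â_W when W is entrywise invertible, the weighted error ||W ∘ (A − Â)||_F equals the unweighted error ||W ∘ A − Â_W||_F, which step 1 minimizes over rank-rk matrices.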
Open Source Code: Yes
"All code used in the experiments is available in the supplementary."
Open Datasets: Yes
"We train a basic multilayer perceptron (MLP) on four image datasets, mnist, fashion mnist, smallnorb, and colorectal histology, which were selected from the tensorflow datasets catalogue for simplicity of processing (e.g., fixed feature size, no need for embeddings, etc.)."
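All four datasets are available by name in the tensorflow_datasets catalogue; a minimal loading sketch, assuming tensorflow_datasets is installed (any preprocessing beyond loading is out of scope here):

```python
import tensorflow_datasets as tfds

# Catalogue names for the four image datasets used in the experiments.
for name in ["mnist", "fashion_mnist", "smallnorb", "colorectal_histology"]:
    ds, info = tfds.load(name, split="train", with_info=True)
    print(name, info.features)
```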
Dataset Splits: No
The paper mentions training and testing but does not explicitly detail validation dataset splits (percentages or sample counts) or a cross-validation setup.
Hardware Specification: Yes
"Our experiments are conducted on a 2019 MacBook Pro with a 2.6 GHz 6-Core Intel Core i7 processor."
Software Dependencies: No
The paper mentions using the "tensorflow library" but does not provide specific version numbers for TensorFlow or any other software dependencies.
Experiment Setup: Yes
"This was run for 100 epochs, with an initial learning rate of 1.0 decayed by a factor of 0.7 every 10 steps. In the experiments, we run 25 iterations."
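The stated schedule maps directly onto tf.keras's staircase ExponentialDecay; a minimal sketch follows, noting that the optimizer is not specified in the excerpt, so Adadelta below is purely a placeholder assumption:

```python
import tensorflow as tf

# Learning rate 1.0, multiplied by 0.7 every 10 steps (staircase decay).
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1.0,
    decay_steps=10,
    decay_rate=0.7,
    staircase=True,
)
# The excerpt does not name the optimizer; Adadelta is a placeholder.
optimizer = tf.keras.optimizers.Adadelta(learning_rate=schedule)
```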