reproducibilityindex.ai

How to Fake Multiply by a Gaussian Matrix

Authors: Michael Kapralov, Vamsi Potluru, David Woodruff

ICML 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments. We empirically validate our results for both NMF and SVM applications. For NMF, we give an experimental evaluation by comparing with state-of-the-art algorithms such as SPA (Gillis et al., 2014), XRAY (Kumar et al., 2013), naïve random projections (Damle and Sun, 2014), structured Gaussian random projections (Tepper and Sapiro, 2015), and Tall-Skinny QR factorization (Benson et al., 2014) for NMF problems with applications to breast cancer, flow cytometry, climate data and movie analysis. Also, we show experimental speedups using our projection when combined with linear SVM solvers for document classification problems (Paul et al., 2014).
Researcher Affiliation	Collaboration	Michael Kapralov (MICHAEL.KAPRALOV@EPFL.CH), EPFL, Lausanne, Switzerland; Vamsi K. Potluru (VAMSI_POTLURU@CABLE.COMCAST.COM), Comcast Cable, Washington DC, USA 20005; David P. Woodruff (DPWOODRU@US.IBM.COM), IBM Research, Almaden, San Jose, CA, USA
Pseudocode	Yes	Algorithm 1 Count Gauss NMF (CG) Initialize the index sets Imax, Imin to empty.
Open Source Code	Yes	In all our experiments [1], we set B = 5m. [...] [1] https://github.com/marinkaz/nimfa
Open Datasets	Yes	Gene expression breast cancer dataset. We utilize the hereditary breast cancer dataset collected by Hedenfalk et al. (2001) which consists of the expression levels of 3226 genes on 22 samples from breast cancer patients. [...] TechTC-300 Dataset. We obtained the TechTC-300 dataset which is a comprehensive directory of the web. There are 295 pairs of categories which provides a rich framework for running SVM experiments (Paul et al., 2014).
Dataset Splits	Yes	The results are shown over 10-fold cross validation with 4 repetitions and 3 runs over the random projection matrices.
Hardware Specification	No	The paper states that certain calculations “can be solved in a couple of seconds on an off-the-shelf desktop,” but this is too vague to be a specific hardware specification. No detailed hardware information (e.g., CPU, GPU models, memory) used for experiments is provided.
Software Dependencies	No	The paper mentions that “LIBSVM was used with a linear kernel” for SVM experiments. However, it does not specify the version number of LIBSVM or any other software dependencies.
Experiment Setup	Yes	In all our experiments [1], we set B = 5m. [...] LIBSVM was used with a linear kernel and soft-margin parameter C set to 500 for all experiments and we set the projections to 128, 256, and 512.
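
The setup values quoted in the table (B = 5m, C = 500, projections of 128, 256, and 512, 10-fold cross validation with 4 repetitions and 3 projection runs) can be laid out as a small configuration sketch. The code below is illustrative only and is not the authors' implementation. It assumes the three settings combine as a full grid, and it assumes the projection hashes coordinates into B signed buckets and then applies a dense m x B Gaussian map, which is our reading of the "Count Gauss" name rather than something stated in the table. The names `experiment_grid` and `count_gauss_project` are hypothetical, and only the Python standard library is used.

```python
import math
import random
from itertools import product

# Values quoted in the table above.
PROJECTION_DIMS = (128, 256, 512)  # projection sizes m
BUCKET_FACTOR = 5                  # B = 5m
SVM_C = 500                        # soft-margin parameter, linear kernel
N_FOLDS, N_REPEATS, N_PROJECTION_RUNS = 10, 4, 3


def experiment_grid():
    """Enumerate one entry per SVM fit, assuming a full grid (our assumption)."""
    for m, rep, run, fold in product(
        PROJECTION_DIMS,
        range(N_REPEATS),
        range(N_PROJECTION_RUNS),
        range(N_FOLDS),
    ):
        yield {
            "m": m,
            "B": BUCKET_FACTOR * m,
            "C": SVM_C,
            "repeat": rep,
            "run": run,
            "fold": fold,
        }


def count_gauss_project(x, m, seed=0):
    """Hypothetical helper: signed hashing into B = 5m buckets, then a Gaussian map.

    The same seed and input length always produce the same linear map, because
    the random draws depend only on the generator state and never on x.
    """
    rng = random.Random(seed)
    n_buckets = BUCKET_FACTOR * m
    buckets = [0.0] * n_buckets
    # Stage 1: each coordinate goes to one random bucket with a random sign.
    for value in x:
        bucket = rng.randrange(n_buckets)
        sign = rng.choice((-1.0, 1.0))
        buckets[bucket] += sign * value
    # Stage 2: dense Gaussian map from B buckets down to m dimensions.
    scale = 1.0 / math.sqrt(m)
    return [
        scale * sum(rng.gauss(0.0, 1.0) * b for b in buckets)
        for _ in range(m)
    ]


if __name__ == "__main__":
    grid = list(experiment_grid())
    print(len(grid), "fits per dataset under the full-grid assumption")
    print(len(count_gauss_project([1.0, 0.0, 2.0, -1.0], m=4, seed=1)))
```

Under the full-grid assumption this gives 3 x 4 x 3 x 10 = 360 fits per dataset. The helper draws the hash and Gaussian entries on the fly for brevity; a practical version would store them, or use a sparse matrix library, so the same map can be reused across many vectors cheaply.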