Fast Deterministic CUR Matrix Decomposition with Accuracy Assurance

Authors: Yasutoshi Ida, Sekitoshi Kanai, Yasuhiro Fujiwara, Tomoharu Iwata, Koh Takeuchi, Hisashi Kashima

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluated the processing times and the values of the objectives. We compared our method with the original deterministic CUR (origin) (Bien et al., 2010), the sequential strong rule (SSR) (Tibshirani et al., 2012), and the sequential strong rule with warm start (SSR+WS) (Ndiaye et al., 2017). We tuned λ for all the approaches based on the sequential rule by following the methods in (Tibshirani et al., 2012; Wang & Ye, 2014; Ndiaye et al., 2017; Ida et al., 2019). The search space was a non-increasing sequence of Q parameters (λ_q)_{q=0}^{Q−1} defined as λ_q = λ_max · 10^{−γq/(Q−1)}. λ_max is the smallest λ for which all the parameters are zero at the optimal solution, and it was computed by following (Tibshirani et al., 2012). We used γ = 4 and Q = 100 (Wang & Ye, 2014; Ndiaye et al., 2017; Ida et al., 2019). We stopped the algorithm for each λ_q when the relative tolerance of the parameter matrix dropped below 10^{−5} for all the approaches (Johnson & Guestrin, 2016; 2017; Ida et al., 2019). We stopped the sequential rule when all the parameters were nonzero, since our purpose is to select a subset of the columns. We conducted the experiments on five datasets from the LIBSVM (Chang & Lin, 2011) and OpenML (Vanschoren et al., 2013) websites, namely colon-cancer, Bioresponse, QSAR-TID-52, Madelon, and Slashdot; the sizes of the data matrices were 62 × 2000, 3751 × 1776, 877 × 1024, 2000 × 500, and 3782 × 1079, respectively. Each experiment was conducted with one CPU core and 264 GB of main memory on a 2.20 GHz Intel Xeon server running Linux.
Researcher Affiliation | Collaboration | Yasutoshi Ida 1,2; Sekitoshi Kanai 1; Yasuhiro Fujiwara 3; Tomoharu Iwata 3; Koh Takeuchi 2,4; Hisashi Kashima 2,4. 1: NTT Software Innovation Center, Tokyo, Japan; 2: Department of Intelligence Science and Technology, Kyoto University, Kyoto, Japan; 3: NTT Communication Science Laboratories, Kyoto, Japan; 4: RIKEN Center for Advanced Intelligence Project, Tokyo, Japan.
Pseudocode | Yes | Algorithm 1 shows the pseudocode of coordinate descent. ... Algorithm 2 shows the pseudocode of our algorithm, namely Fast Deterministic CUR, which utilizes the abovementioned definitions and lemmas.
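For context, the coordinate descent referenced above solves the group-lasso column-selection objective of deterministic CUR (Bien et al., 2010). The following is an illustrative sketch of that plain block coordinate descent, not the paper's accelerated Algorithm 2; the function name and stopping details are our assumptions:

```python
import numpy as np

def group_lasso_cur_cd(A, lam, n_iter=100, tol=1e-5):
    """Sketch of block coordinate descent for the deterministic CUR
    objective of Bien et al. (2010):
        min_W 0.5 * ||A - A W||_F^2 + lam * sum_j ||W[j, :]||_2.
    Columns j of A with nonzero rows W[j, :] are the selected columns.
    """
    n, d = A.shape
    W = np.zeros((d, d))
    col_sq = (A ** 2).sum(axis=0)        # ||a_j||^2 for each column j
    R = A - A @ W                        # current residual
    for _ in range(n_iter):
        W_old = W.copy()
        for j in range(d):
            if col_sq[j] == 0.0:
                continue
            R += np.outer(A[:, j], W[j])  # remove column j's contribution
            c = A[:, j] @ R               # correlation with the residual
            norm_c = np.linalg.norm(c)
            if norm_c > lam:              # group soft-thresholding
                W[j] = (1.0 - lam / norm_c) * c / col_sq[j]
            else:
                W[j] = 0.0
            R -= np.outer(A[:, j], W[j])  # add updated contribution back
        # relative tolerance on the parameter matrix (stopping rule above)
        if np.linalg.norm(W - W_old) <= tol * max(np.linalg.norm(W), 1e-12):
            break
    selected = np.flatnonzero(np.linalg.norm(W, axis=1) > 0)
    return W, selected
```

For λ above λ_max every row of W is thresholded to zero (no columns selected); as λ shrinks, more rows become nonzero, which is why the sequential rule sweeps λ downward.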
Open Source Code | No | The paper does not provide any links to its source code or state that it is openly available.
Open Datasets | Yes | We conducted the experiments on five datasets from the LIBSVM (Chang & Lin, 2011) and OpenML (Vanschoren et al., 2013) websites, namely colon-cancer, Bioresponse, QSAR-TID-52, Madelon, and Slashdot
Dataset Splits | No | The paper does not explicitly describe train/validation/test splits for the datasets used, beyond describing how the regularization parameter λ was tuned.
Hardware Specification | Yes | Each experiment was conducted with one CPU core and 264 GB of main memory on a 2.20 GHz Intel Xeon server running Linux.
Software Dependencies | No | The paper mentions 'LIBSVM' but does not specify version numbers for any software dependencies or libraries used in their implementation.
Experiment Setup | Yes | We tuned λ for all the approaches based on the sequential rule by following the methods in (Tibshirani et al., 2012; Wang & Ye, 2014; Ndiaye et al., 2017; Ida et al., 2019). The search space was a non-increasing sequence of Q parameters (λ_q)_{q=0}^{Q−1} defined as λ_q = λ_max · 10^{−γq/(Q−1)}. λ_max is the smallest λ for which all the parameters are zero at the optimal solution, and it was computed by following (Tibshirani et al., 2012). We used γ = 4 and Q = 100 (Wang & Ye, 2014; Ndiaye et al., 2017; Ida et al., 2019). We stopped the algorithm for each λ_q when the relative tolerance of the parameter matrix dropped below 10^{−5} for all the approaches (Johnson & Guestrin, 2016; 2017; Ida et al., 2019).
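The geometric λ grid described above can be reproduced with a short sketch (the function name is ours; λ_max would be computed from the data as in Tibshirani et al., 2012):

```python
import numpy as np

def lambda_path(lambda_max, gamma=4.0, Q=100):
    """Non-increasing regularization path
        lambda_q = lambda_max * 10**(-gamma * q / (Q - 1)),  q = 0, ..., Q-1,
    with gamma = 4 and Q = 100 as in the reported experiments."""
    q = np.arange(Q)
    return lambda_max * 10.0 ** (-gamma * q / (Q - 1))
```

The first value is λ_max itself (all parameters zero) and the last is λ_max · 10^{−γ}, i.e. four orders of magnitude smaller for γ = 4, so the sweep moves from an empty column subset toward a fully dense one.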