GPU-Accelerated Primal Learning for Extremely Fast Large-Scale Classification
Authors: John T. Halloran, David M. Rocke
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | All experiments were run on a dual Intel Xeon Gold 5118 compute node with 48 computational threads, an NVIDIA Tesla V100 GPU, and 768 GB of memory. The TRON-LR GPU-optimized and mixed-architecture solvers (described in Section 4.2) are referred to as TRON-LR-GPU and TRON-LR-MIX, respectively. |
| Researcher Affiliation | Academia | John T. Halloran Department of Public Health Sciences University of California, Davis jthalloran@ucdavis.edu David M. Rocke Department of Public Health Sciences University of California, Davis dmrocke@ucdavis.edu |
| Pseudocode | Yes | Algorithm 1 The TRON algorithm |
| Open Source Code | No | The paper states that the implementations were "developed based on LIBLINEAR v2.30" but gives no explicit statement or link indicating that the GPU-optimized code itself is open source or publicly available. |
| Open Datasets | Yes | Six datasets of varying statistics (i.e., number of features, instances, and nonzero elements) were downloaded from https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/ and used to benchmark the TRON-LR solvers (statistics for each dataset are listed in [23]). |
| Dataset Splits | Yes | Furthermore, to prevent overfitting and improve generalizability within each iteration, three-fold cross-validation is carried out over three disjoint partitions of the original dataset, followed by further nested cross-validation within each fold [16]. |
| Hardware Specification | Yes | All experiments were run on a dual Intel Xeon Gold 5118 compute node with 48 computational threads, an NVIDIA Tesla V100 GPU, and 768 GB of memory. |
| Software Dependencies | Yes | TRON-LR-GPU, TRON-LR-MIX, and TRON-LR-GPU0 were all developed based on LIBLINEAR v2.30. Single-threaded LIBLINEAR tests were run using v2.30. The multithread-optimized version of TRON-LR described in [36], referred to herein as TRON-LR-CPU, was tested using multi-core LIBLINEAR v2.30. The original Percolator SVM learning runtimes (collected using Percolator v3.04.0) were 14.4 hours and 4.4 days for the Kim and Wilhelm datasets, respectively. |
| Experiment Setup | Yes | All single-threaded TRON-LR implementations (i.e., TRON-LR-GPU0, TRON-LR-GPU, and the single-threaded optimized version of TRON-LR in standard LIBLINEAR) were run with the same command line parameters: -c 4 -e 0.1 -s 0. Multithreaded implementations were run with the additional flag -nr i, specifying the use of i compute threads. |
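
To make the Pseudocode row (Algorithm 1, the TRON algorithm) concrete, below is a minimal NumPy sketch of a trust-region Newton iteration with truncated conjugate gradient for primal L2-regularized logistic regression. It illustrates the general method only, not the paper's CUDA/C++ solvers; the defaults C=4 and eps=0.1 mirror the flags quoted in the Experiment Setup row, while the trust-region update constants, the CG cap, and the stopping rule are simplifying assumptions. The matrix-vector products inside `f_grad` and `hess_vec` are the kind of operations that GPU offloading in solvers like TRON-LR-GPU and TRON-LR-MIX would target.

```python
# Minimal NumPy sketch (not the paper's code) of trust-region Newton (TRON)
# with truncated CG for primal L2-regularized logistic regression.
import numpy as np

def f_obj(w, X, y, C):
    # f(w) = 0.5 ||w||^2 + C * sum_i log(1 + exp(-y_i x_i^T w))
    return 0.5 * (w @ w) + C * np.logaddexp(0.0, -y * (X @ w)).sum()

def f_grad(w, X, y, C):
    sigma = 1.0 / (1.0 + np.exp(-y * (X @ w)))
    return w - C * (X.T @ (y * (1.0 - sigma)))

def hess_vec(w, v, X, y, C):
    # Hessian-vector product H v = v + C * X^T (D (X v)), D = diag(sigma(1-sigma)).
    sigma = 1.0 / (1.0 + np.exp(-y * (X @ w)))
    return v + C * (X.T @ (sigma * (1.0 - sigma) * (X @ v)))

def tron(X, y, C=4.0, eps=0.1, max_iter=100):
    w = np.zeros(X.shape[1])
    g0_norm = np.linalg.norm(f_grad(w, X, y, C))
    delta = g0_norm                              # initial trust-region radius (assumed)
    for _ in range(max_iter):
        g = f_grad(w, X, y, C)
        if np.linalg.norm(g) <= eps * g0_norm:   # simplified relative stopping rule
            break
        # Truncated CG (Steihaug) for min_s g^T s + 0.5 s^T H s, ||s|| <= delta.
        s, r = np.zeros_like(w), -g
        d = r.copy()
        for _ in range(250):
            Hd = hess_vec(w, d, X, y, C)
            alpha = (r @ r) / (d @ Hd)
            s_next = s + alpha * d
            if np.linalg.norm(s_next) >= delta:
                # Step to the trust-region boundary along d and stop CG.
                a, b, c = d @ d, s @ d, s @ s - delta ** 2
                s = s + ((-b + np.sqrt(b * b - a * c)) / a) * d
                break
            r_new = r - alpha * Hd
            s = s_next
            if np.linalg.norm(r_new) <= 0.1 * np.linalg.norm(g):
                break
            d = r_new + ((r_new @ r_new) / (r @ r)) * d
            r = r_new
        # Accept or reject the step and update the radius.
        pred = -(g @ s + 0.5 * (s @ hess_vec(w, s, X, y, C)))
        rho = (f_obj(w, X, y, C) - f_obj(w + s, X, y, C)) / pred
        if rho > 1e-4:
            w = w + s
        delta *= 0.25 if rho < 0.25 else (2.0 if rho > 0.75 else 1.0)
    return w
```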
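
The Dataset Splits row quotes Percolator's scheme of three disjoint outer folds with a further nested cross-validation inside each fold. The sketch below shows that split structure in generic scikit-learn terms; the logistic-regression classifier and the C grid are illustrative stand-ins, not Percolator's actual SVM learning procedure.

```python
# Schematic nested three-fold cross-validation: outer folds for evaluation,
# inner three-fold CV on each training fold for hyperparameter selection.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

def nested_three_fold_cv(X, y, c_grid=(0.25, 1.0, 4.0)):
    outer = KFold(n_splits=3, shuffle=True, random_state=0)
    outer_scores = []
    for tr, te in outer.split(X):
        # Inner three-fold CV over the outer training fold selects C.
        inner = KFold(n_splits=3, shuffle=True, random_state=1)
        mean_scores = []
        for c in c_grid:
            fold_scores = [
                LogisticRegression(C=c, max_iter=1000)
                .fit(X[tr][i_tr], y[tr][i_tr])
                .score(X[tr][i_va], y[tr][i_va])
                for i_tr, i_va in inner.split(X[tr])
            ]
            mean_scores.append(np.mean(fold_scores))
        best_c = c_grid[int(np.argmax(mean_scores))]
        # Refit on the full outer training fold, score on the held-out fold.
        model = LogisticRegression(C=best_c, max_iter=1000).fit(X[tr], y[tr])
        outer_scores.append(model.score(X[te], y[te]))
    return float(np.mean(outer_scores))
```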
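
For the Experiment Setup row, the quoted flags are standard LIBLINEAR options: -s 0 selects primal L2-regularized logistic regression, -c 4 sets the cost parameter, and -e 0.1 the stopping tolerance. The snippet below is an illustrative way to run the same settings through LIBLINEAR's bundled Python interface (liblinearutil); the file path is a placeholder for one of the LIBSVM-format datasets referenced in the Open Datasets row, not a path from the paper.

```python
# Illustrative run of the paper's quoted solver settings via LIBLINEAR's
# Python interface; the dataset path is a hypothetical placeholder.
from liblinearutil import svm_read_problem, train, predict

y, x = svm_read_problem('data/some_dataset.libsvm')  # sparse LIBSVM-format file
model = train(y, x, '-s 0 -c 4 -e 0.1')              # same flags as the quoted setup
p_labels, p_acc, p_vals = predict(y, x, model)       # sanity-check training accuracy
```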