Turnstile $\ell_p$ leverage score sampling with applications
Authors: Alexander Munteanu, Simon Omlor
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare experimentally to plain oblivious sketching and plain leverage score sampling algorithms for ℓp and logistic regression. |
| Researcher Affiliation | Academia | 1Dortmund Data Science Center, Faculties of Statistics and Computer Science, TU Dortmund University, Dortmund, Germany 2Faculty of Statistics, TU Dortmund University, Dortmund, Germany 3Lamarr-Institute for Machine Learning and Artificial Intelligence, Dortmund, Germany. Correspondence to: Alexander Munteanu <alexander.munteanu@tu-dortmund.de>, Simon Omlor <simon.omlor@tu-dortmund.de>. |
| Pseudocode | Yes | Algorithm 1 Finding ℓp heavy hitters. |
| Open Source Code | Yes | Our new and combined code is available at https://github.com/Tim907/turnstile-sampling. |
| Open Datasets | Yes | The following real-world datasets have become standard baselines to measure the performance of data reduction algorithms for logistic regression and ℓ1 regression: Covertype, Webspam, and KDDCup, see Appendix I.2 for details. |
| Dataset Splits | No | For each dataset, and each of the two problems, we first solve the original large instance to optimality to obtain zopt. We then run the data reduction algorithms, for varying target coreset resp. sketch sizes, and solve the reduced and reweighted problem to optimality to obtain the approximation z̃. For each target size, we repeat this process 21 times and plot in Figure 1 the median of the resulting approximation ratios f(z̃)/f(zopt). |
| Hardware Specification | Yes | All experiments were run on a workstation with AMD Ryzen Threadripper PRO 5975WX, 32 cores at 3.6GHz, 512GB DDR4-3200. |
| Software Dependencies | No | We experienced convergence problems using the scipy optimizer for the non-differentiable ℓ1 loss. |
| Experiment Setup | No | For each dataset, and each of the two problems, we first solve the original large instance to optimality to obtain zopt. We then run the data reduction algorithms, for varying target coreset resp. sketch sizes, and solve the reduced and reweighted problem to optimality to obtain the approximation z̃. For each target size, we repeat this process 21 times and plot in Figure 1 the median of the resulting approximation ratios f(z̃)/f(zopt). |
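The evaluation protocol quoted above (solve the full instance once, then repeatedly reduce, re-solve, and report the median approximation ratio over 21 runs) can be sketched as follows. This is a hedged illustration, not the paper's code: it substitutes uniform row sampling for the paper's turnstile ℓp leverage score sampling, and an ℓ2 least-squares objective for their ℓ1/logistic losses, so that a closed-form solver (`numpy.linalg.lstsq`) keeps the sketch self-contained.

```python
import numpy as np

def fit(A, b):
    # Closed-form least-squares solve (l2 stand-in for the paper's
    # l1 / logistic objectives, which need an iterative solver).
    z, *_ = np.linalg.lstsq(A, b, rcond=None)
    return z

def loss(A, b, z):
    # Full-data objective f(z); approximation ratios are always
    # measured on the ORIGINAL data, not on the reduced instance.
    return np.linalg.norm(A @ z - b)

rng = np.random.default_rng(0)
n, d, k = 5000, 10, 200  # original size, dimension, target coreset size
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

z_opt = fit(A, b)  # solve the large instance to optimality once

ratios = []
for _ in range(21):  # 21 repetitions per target size, as in the paper
    idx = rng.choice(n, size=k, replace=False)  # placeholder: uniform sampling
    w = np.sqrt(n / k)  # reweight so the subsample estimates the full loss
    z_tilde = fit(w * A[idx], w * b[idx])  # solve the reduced problem
    ratios.append(loss(A, b, z_tilde) / loss(A, b, z_opt))

median_ratio = np.median(ratios)
print(median_ratio)  # values near 1 mean the reduction preserves the optimum
```

In the paper's experiments the sampling step would instead be one of the compared data reduction schemes (turnstile sampling, oblivious sketching, or plain leverage score sampling), and the plot in Figure 1 collects these medians across varying target sizes.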