Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Computing Approximate ℓp Sensitivities

Authors: Swati Padmanabhan, David P. Woodruff, Qiuyi (Richard) Zhang

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We consolidate our theoretical contributions with a demonstration of the empirical advantage of our approximation algorithms over the naive approach of estimating these sensitivities, using the UCI Machine Learning Dataset Repository [46], in Section 4. We found that many datasets have extremely small total sensitivities; fast sensitivity approximation algorithms therefore exploit this small intrinsic dimensionality of real-world data far better than Lewis weights sampling, with sampling bounds often a factor of 2-5x better. We also found these sensitivities easy to approximate, with an accuracy-runtime tradeoff much better than our theory suggests, running up to 40x faster. Lastly, we show that our algorithm for estimating the total sensitivity produces accurate estimates, up to small constant factors, with a runtime speedup of at least 4x.
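For context on the quantity being approximated: the ℓp sensitivity of row a_i of a matrix A is sup_x |a_i^T x|^p / ||Ax||_p^p. The sketch below is not one of the paper's approximation algorithms; it is a hedged illustration of the exact p = 2 case, where the sensitivity reduces to the classical leverage score a_i^T (A^T A)^+ a_i and can be read off a thin QR factorization (assuming A has full column rank):

```python
import numpy as np

def l2_sensitivities(A):
    """Exact l2 sensitivities (leverage scores) of the rows of A.

    For p = 2, the sensitivity of row a_i, sup_x (a_i^T x)^2 / ||Ax||_2^2,
    equals the leverage score a_i^T (A^T A)^+ a_i.
    """
    # Thin QR gives an orthonormal basis Q for the column space of A;
    # the leverage score of row i is then the squared norm of row i of Q.
    Q, _ = np.linalg.qr(A)
    return np.sum(Q**2, axis=1)

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 5))
scores = l2_sensitivities(A)
print(scores.sum())  # total l2 sensitivity = rank(A), here 5
```

For p = 2 the scores always sum to rank(A), which is the "small intrinsic dimensionality" that, per the response above, the paper's fast algorithms exploit for general p.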
Researcher Affiliation | Collaboration | Swati Padmanabhan, Massachusetts Institute of Technology; David P. Woodruff, Carnegie Mellon University; Qiuyi (Richard) Zhang, Google Research
Pseudocode | Yes | Algorithm 1: Approximating ℓ1-Sensitivities: Row-wise Approximation; Algorithm 2: Approximating the Sum of ℓp-Sensitivities for p > 1; Algorithm 3: Approximating the Maximum of ℓ1-Sensitivities; Algorithm 4: Approximating the Sum of ℓ1-Sensitivities; Algorithm 5: Subroutine for Approximating the Sum of ℓ1-Sensitivities; Algorithm 6: Approximating ℓp-Sensitivities: Row-wise Approximation; Algorithm 7: Approximating the Maximum of ℓp-Sensitivities
Open Source Code | No | The paper mentions using the UCI Machine Learning Dataset Repository [46] but does not provide any statement or link for the source code of their own methods.
Open Datasets | Yes | We demonstrate our fast sensitivity approximations on multiple real-world datasets in the UCI Machine Learning Dataset Repository [46], such as the wine and fires datasets, for different p and varying approximation parameters α. We focus on the wine dataset here, with full details in Appendix E.
Dataset Splits | No | The paper uses datasets from the UCI Machine Learning Dataset Repository but does not explicitly describe training, validation, or test splits for these datasets within the context of its experiments.
Hardware Specification | No | The paper states "We demonstrate our fast sensitivity approximations on multiple real-world datasets..." but does not provide any details about the specific hardware (CPU, GPU, memory, etc.) used for these experiments.
Software Dependencies | No | The paper describes algorithms and experiments but does not provide specific version numbers for software dependencies or libraries used in the implementation (e.g., "Python 3.x", "NumPy 1.x").
Experiment Setup | No | Section 4, titled "Numerical experiments", describes the datasets and metrics used, but does not provide specific experimental setup details such as hyperparameters (e.g., learning rates, batch sizes, epochs), model initialization, or specific optimizer settings.