Testing for Families of Distributions via the Fourier Transform
Authors: Clément L. Canonne, Ilias Diakonikolas, Alistair Stewart
NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | The main contribution of this work is a simple and general testing technique that is applicable to all distribution families whose Fourier spectrum satisfies a certain approximate sparsity property. We apply our Fourier-based framework to obtain near sample-optimal and computationally efficient testers for the following fundamental distribution families: Sums of Independent Integer Random Variables (SIIRVs), Poisson Multinomial Distributions (PMDs), and Discrete Log-Concave Distributions. For the first two, ours are the first non-trivial testers in the literature, vastly generalizing previous work on testing Poisson Binomial Distributions. For the third, our tester improves on prior work in both sample and time complexity. We prove: Theorem 1 (Testing SIIRVs). Given parameters k, n N and sample access to a distribution over N, there exists an algorithm (Algorithm 1) for T(SIIRVn,k, ϵ) which takes samples, and runs in time n(k/ϵ)O(k log(k/ϵ)). |
| Researcher Affiliation | Academia | Clément L. Canonne Stanford University ccanonne@stanford.edu Ilias Diakonikolas University of Southern California diakonik@usc.edu Alistair Stewart University of Southern California stewart.al@gmail.com |
| Pseudocode | Yes | Algorithm 1 Testing the Fourier Transform Effective Support; Algorithm 2 Tolerant L2 identity tester |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code, nor does it include links to a code repository or mention code in supplementary materials. |
| Open Datasets | No | This paper is theoretical and focuses on algorithm design and complexity analysis for distribution testing, without conducting experiments on specific datasets. Therefore, it does not mention any publicly available datasets or provide access information. |
| Dataset Splits | No | The paper is theoretical and does not involve empirical evaluation with data. Therefore, it does not provide information about training/validation/test dataset splits. |
| Hardware Specification | No | The paper is theoretical and focuses on algorithms and complexity analysis; it does not describe any experimental setup or report on experiments that would require specific hardware specifications. |
| Software Dependencies | No | The paper is theoretical and describes algorithms and proofs; it does not report on any software implementations or list software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and focuses on algorithm design and complexity analysis; it does not describe specific experimental setup details, hyperparameters, or training configurations. |