Sequential Algorithms for Testing Closeness of Distributions
Authors: Aadil Oufkir, Omar Fawzi, Nicolas Flammarion, Aurélien Garivier
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Figure 1: Left: histogram of the stopping times for 100 Monte-Carlo experiments. Black: D1 = D2 = U_n, blue (resp. magenta): D1 = U_n and D2 = {(1 ± 2ε)/n} (resp. {(1 ± 4ε)/n}). Right: D1 = U_2 and D2 = {(1 ± 2ε)/2}. The sequential tester stops as soon as the statistic enters the red region (for H1) or blue region (for H2) whereas the batch tester waits for the red and blue regions to cover the whole segment [0, 1]. The blue/red and black dashed lines represent respectively the stopping times of the sequential and batch algorithms. We note that, in both cases, the sequential tester stops long before the batch algorithm. |
| Researcher Affiliation | Academia | Omar Fawzi Univ Lyon, ENS Lyon, UCBL CNRS, Inria, LIP, F-69342 Lyon Cedex 07, France omar.fawzi@ens-lyon.fr Nicolas Flammarion EPFL Lausanne, Switzerland nicolas.flammarion@epfl.ch Aurélien Garivier UMPA UMR 5669 and LIP UMR 5668 CNRS, ENS de Lyon, UCB Lyon 1 Lyon, France aurelien.garivier@ens-lyon.fr Aadil Oufkir LIP UMR 5668 CNRS ENS de Lyon, UCB Lyon 1 Lyon, France aadil.oufkir@ens-lyon.fr |
| Pseudocode | Yes | Algorithm 1 Distinguish between D1 = D2 and TV(D1, D2) > ε with high probability... Algorithm 2 Distinguish between D1 = D2 and TV(D1, D2) > ε with high probability |
| Open Source Code | No | The paper does not provide any links to source code repositories or explicitly state that the code for the described methodology is publicly available. |
| Open Datasets | No | The paper conducts theoretical analysis and simulations based on sampling from defined distributions (D1, D2), not using a traditional publicly available dataset with a specific name or access information. |
| Dataset Splits | No | The paper does not discuss standard training, validation, or test dataset splits in the context of empirical model evaluation. |
| Hardware Specification | No | The paper does not mention any specific hardware (e.g., CPU, GPU models, cloud platforms) used for running the simulations or experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies or their version numbers used for the implementation or experiments. |
| Experiment Setup | No | The paper describes the algorithms and their theoretical properties but does not provide specific hyperparameter values or detailed system-level training settings for the Monte-Carlo experiments beyond the distributions themselves and the error tolerance (epsilon, delta). |
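
Since no code is released, the following is a minimal sketch of how the two distributions named in the Figure 1 caption could be set up for a simulation. It is not the authors' implementation and does not reproduce Algorithm 1 or 2. The function names, the alternating sign pattern of the perturbation, and the values of `n`, `eps`, and the sample size are all assumptions made for illustration.

```python
import numpy as np


def perturbed_uniform(n, eps, c=2.0):
    """Masses (1 +/- c*eps)/n on n points.

    Assumption: signs alternate +, -, +, - ... so the perturbations cancel;
    the caption does not say which points get which sign.
    """
    if n % 2 != 0:
        raise ValueError("n must be even so the +/- perturbations cancel")
    if not 0 <= c * eps < 1:
        raise ValueError("need 0 <= c*eps < 1 for valid probabilities")
    signs = np.where(np.arange(n) % 2 == 0, 1.0, -1.0)
    return (1.0 + c * eps * signs) / n


def tv_distance(p, q):
    """Total variation distance: half the L1 distance between p and q."""
    return 0.5 * np.abs(p - q).sum()


# Hypothetical parameter values, chosen only for the example.
n, eps = 10, 0.1
d1 = np.full(n, 1.0 / n)        # D1 = U_n
d2 = perturbed_uniform(n, eps)  # D2 = {(1 +/- 2*eps)/n}

# Draw i.i.d. samples from D2, as a Monte-Carlo run would.
rng = np.random.default_rng(0)
samples = rng.choice(n, size=1000, p=d2)
```

Under these assumptions the ±2ε perturbation places D2 at total variation distance exactly ε from the uniform distribution, which is the boundary of the alternative TV(D1, D2) > ε that the algorithms are stated to distinguish from D1 = D2.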