Private and Non-private Uniformity Testing for Ranking Data
Authors: Róbert Busa-Fekete, Dimitris Fotakis, Emmanouil Zampetakis
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We carry out large-scale experiments, including m = 10,000, to show that our uniformity testing algorithms scale gracefully with m. Section 7 (Experiments): We shall present synthetic experiments to assess the performance of the proposed tests. |
| Researcher Affiliation | Collaboration | Róbert Busa-Fekete, Google Research, New York, USA (busarobi@google.com); Dimitris Fotakis, National Technical University of Athens, Greece (fotakis@cs.ntua.gr); Manolis Zampetakis, University of California, Berkeley, USA (mzampet@berkeley.edu) |
| Pseudocode | Yes | Algorithm 1 2SAMP: Uniformity Test with Two Samples, Algorithm 2 Uniformity Test (UNIF), Algorithm 3 Central DP Uniformity Test (TRUN), Algorithm 5, Algorithm 6 |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] |
| Open Datasets | No | We shall present synthetic experiments to assess the performance of the proposed tests. Also: We used synthetic data. No information is provided on public access to the synthetic data itself or on the exact generation process needed to reproduce it as a dataset. |
| Dataset Splits | No | The paper discusses sample complexity for statistical tests and uses synthetic data, but does not mention any training, validation, or test dataset splits. |
| Hardware Specification | No | Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [N/A] We used data centers to compute the experiments. I believe that it is not so relevant to this work how long the computation did take. |
| Software Dependencies | No | The paper does not provide specific software dependencies or version numbers for the key software components used in the experiments. |
| Experiment Setup | Yes | Every testing algorithm we presented has a tolerance parameter and significance δ. We used δ = 0.05 in every case. The tolerance parameter affects only the sample size of the testing algorithms. Instead of setting it to a certain value, we plotted the power of the algorithms for various sample sizes. In this way, we could compare the performance of the testing algorithms based on the same number of samples as input. Each result we report here is computed based on 1000 repetitions. The central ranking of each model from which the random samples are generated is selected uniformly at random in each run independently. |
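
The Experiment Setup row describes a Monte Carlo power estimate: fix δ = 0.05, draw a fresh central ranking uniformly at random in each run, generate samples, and report the rejection rate over 1000 repetitions. The loop below is a minimal sketch of that procedure under stated assumptions, not the authors' code. The Mallows model (sampled by repeated insertion) is assumed as the ranking model because the row only mentions a "central ranking", and `pairwise_test` is a simple stand-in test based on a Hoeffding bound, not any of the paper's algorithms. All function and parameter names (`sample_mallows`, `pairwise_test`, `estimate_power`, `phi`) are hypothetical.

```python
import math
import random


def sample_mallows(central, phi, rng):
    """Draw one ranking from a Mallows model via repeated insertion (assumed model).

    phi = 1.0 gives the uniform distribution; phi = 0.0 always returns `central`.
    """
    ranking = []
    for i, item in enumerate(central):
        # Inserting at index j (0..i) creates i - j inversions relative to
        # `central`, so index j gets weight phi ** (i - j).
        weights = [phi ** (i - j) for j in range(i + 1)]
        j = rng.choices(range(i + 1), weights=weights)[0]
        ranking.insert(j, item)
    return ranking


def pairwise_test(samples, delta, a=0, b=1):
    """Stand-in test (NOT from the paper): reject uniformity when item `a`
    precedes item `b` far more or far less often than half the time."""
    n = len(samples)
    wins = sum(r.index(a) < r.index(b) for r in samples)
    # Hoeffding: under uniformity, P(|wins - n/2| > threshold) <= delta.
    threshold = math.sqrt(n * math.log(2.0 / delta) / 2.0)
    return abs(wins - n / 2.0) > threshold


def estimate_power(m, n, phi, delta=0.05, repetitions=1000, seed=0):
    """Fraction of runs in which the test rejects, over `repetitions` runs."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(repetitions):
        central = list(range(m))
        rng.shuffle(central)  # fresh central ranking, uniform at random, per run
        samples = [sample_mallows(central, phi, rng) for _ in range(n)]
        rejections += pairwise_test(samples, delta)
    return rejections / repetitions
```

Calling `estimate_power(m, n, phi)` for a grid of sample sizes `n` gives one power curve per setting, which mirrors the row's choice to plot power against sample size rather than fix the tolerance parameter. With `phi=1.0` the samples are uniform, so the returned value estimates the false-rejection rate instead of power.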