Optimal Robust Learning of Discrete Distributions from Batches

Authors: Ayush Jain, Alon Orlitsky

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The algorithm's computational efficiency enables the first experiments of learning with adversarial batches. We tested the algorithm on simulated data with various adversarial-batch distributions and adversarial noise levels up to β = 0.49. Section 3 reports the performance of the algorithm on experiments performed on the simulated data.
Researcher Affiliation | Academia | University of California, San Diego. Correspondence to: Ayush Jain <ayjain@eng.ucsd.edu>.
Pseudocode | Yes | Algorithm 1 Batch Deletion; Algorithm 2 Robust Distribution Estimator
Open Source Code | Yes | We provide all codes and implementation details in the supplementary material.
Open Datasets | No | We evaluate the algorithm's performance on synthetic data. The paper mentions synthetic data but does not provide concrete access information (link, DOI, repository, or formal citation with authors/year) for generating or obtaining this dataset.
Dataset Splits | No | The paper evaluates the algorithm on synthetic data and runs multiple trials, but it does not specify explicit training, validation, or test dataset splits, nor does it describe a cross-validation setup.
Hardware Specification | Yes | All experiments were performed on a laptop with a configuration of 2.3 GHz Intel Core i7 CPU and 16 GB of RAM.
Software Dependencies | No | The paper mentions "We provide all codes and implementation details in the supplementary material," but it does not specify version numbers for any software dependencies, libraries, or solvers used in the implementation.
Experiment Setup | Yes | For the first plot we fix batch size n = 1000 and β = 0.4 and vary the alphabet size k. We generate m = k/(0.4)^2 batches for each k. In the second plot we fix β = 0.4 and k = 200 and vary the batch size n. We choose m = 40 × (k/β^2) × (1000/n), which keeps the total number of samples, n × m, constant for different n. For the next plot we fix batch size n = 1000 and k = 200.
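
The batch counts in the Experiment Setup row can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the authors' supplementary code. It assumes the second-plot rule reads m = 40 × (k/β^2) × (1000/n), which is one reading of the formula quoted in that row. The function names are hypothetical, and rounding m to the nearest integer is also an assumption.

```python
def batches_for_alphabet_sweep(k, beta=0.4):
    """First plot: m = k / beta^2 batches for alphabet size k.

    Rounding to the nearest integer is an assumption of this sketch.
    """
    return round(k / beta**2)


def batches_for_batch_size_sweep(n, k=200, beta=0.4):
    """Second plot: m = 40 * (k / beta^2) * (1000 / n).

    The formula is an assumed reading of the setup text, and the
    rounding is again an assumption of this sketch.
    """
    return round(40 * (k / beta**2) * (1000 / n))


if __name__ == "__main__":
    # First plot: n = 1000 and beta = 0.4 are fixed, k varies.
    for k in (100, 200, 400):
        print(f"k={k}: m={batches_for_alphabet_sweep(k)}")

    # Second plot: beta = 0.4 and k = 200 are fixed, n varies.
    # The total number of samples n * m should not depend on n.
    for n in (250, 500, 1000, 2000):
        m = batches_for_batch_size_sweep(n)
        print(f"n={n}: m={m}, total samples n*m={n * m}")
```

Under this reading, m scales as 1/n, so halving the batch size doubles the number of batches and the product n × m is unchanged, which matches the statement in the setup text.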