TURF: Two-Factor, Universal, Robust, Fast Distribution Learning Algorithm
Authors: Yi Hao, Ayush Jain, Alon Orlitsky, Vaishakh Ravindrakumar
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments combining the two techniques confirm improved performance over existing methodologies. [...] All experiments compare the ℓ1 error, run for n between 1,000 and 80,000, and averaged over 50 runs. |
| Researcher Affiliation | Academia | Yi Hao 1 Ayush Jain 1 Alon Orlitsky 1 Vaishakh Ravindrakumar 1 1Electrical and Computer Engineering, University of California, San Diego. Correspondence to: Vaishakh Ravindrakumar <varavind@ucsd.edu>. |
| Pseudocode | Yes | Algorithm 1 TURF. Input: Xⁿ, t, d, α. Set k ← 8c₁(d+1)/α (c₁ is the constant in Lemma 9); β ← 1 + 4k/(α(d+1)); η_d ← √((d+1)/n). Then (I_ADLS, f^adls_I), I ∈ I_ADLS ← ADLS(Xⁿ, t, d, β, η_d). Output: f^out_{t,d,α} ← f^adj_{I, f^adls_I, d, k}, I ∈ I_ADLS. |
| Open Source Code | No | The paper does not contain any statement about making its own source code publicly available or provide a link to a code repository. It mentions using 'the code provided in (Acharya et al., 2017)' for ADLS, which is a third-party tool. |
| Open Datasets | No | The paper describes the generation of synthetic data (e.g., mixtures of Beta, Gamma, and Gaussian distributions, and perturbed versions), but does not use or provide access to any specific publicly available datasets. For instance, it describes 'mixtures of Beta: .4B(.8, 4)+.6B(2, 2)' and how they perturb these distributions. |
| Dataset Splits | No | The paper discusses 'cross-validation techniques' for parameter selection, but it does not specify how the data is split into training, validation, or test sets, either as percentages or as sample counts. The experiments instead draw fresh samples from the described distributions. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions) that would be needed to replicate the experiments. It only mentions using 'the code provided in (Acharya et al., 2017)' for ADLS. |
| Experiment Setup | Yes | Algorithm 1 TURF provides parameters (t, d, α, k, β, ηd) used in the algorithm. Section 7, 'Experiments', states: 'run for n between 1,000 and 80,000, and averaged over 50 runs.' It also specifies parameters for synthetic data generation: 'We choose k = 100 and c2 = 0.05, 1, 0.1 for the Beta, Gamma and Gaussian mixtures respectively, to yield the distributions shown in Figure 1.' It also notes 'd=1' and 'd=2' for figures. |
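The parameter assignments quoted from the Algorithm 1 pseudocode above (k, β, η_d as functions of n, d, and α) can be sketched directly. This is a minimal illustration of those three formulas only, not an implementation of TURF; the default value of c₁, the constant from Lemma 9, is a placeholder assumption since the paper's table excerpt does not state it.

```python
import math

def turf_parameters(n, d, alpha, c1=1.0):
    """Compute the parameter settings from Algorithm 1 (TURF).

    c1 is the constant from the paper's Lemma 9; its default here
    is a placeholder assumption, not a value given in the paper.
    """
    k = 8 * c1 * (d + 1) / alpha           # k <- 8*c1*(d+1)/alpha
    beta = 1 + 4 * k / (alpha * (d + 1))   # beta <- 1 + 4k/(alpha*(d+1))
    eta_d = math.sqrt((d + 1) / n)         # eta_d <- sqrt((d+1)/n)
    return k, beta, eta_d

# Example at the smallest sample size used in the experiments (n = 1,000),
# with degree d = 1 and an illustrative approximation factor alpha = 0.5.
k, beta, eta_d = turf_parameters(n=1000, d=1, alpha=0.5)
```

With c₁ = 1 this yields k = 32, β = 129, and η_d = √(2/1000) ≈ 0.0447; in the paper, these values feed the ADLS subroutine and the final adjusted estimate.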