TURF: Two-Factor, Universal, Robust, Fast Distribution Learning Algorithm
Authors: Yi Hao, Ayush Jain, Alon Orlitsky, Vaishakh Ravindrakumar
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments combining the two techniques confirm improved performance over existing methodologies. [...] All experiments compare the ℓ1 error, run for n between 1,000 and 80,000, and averaged over 50 runs. |
| Researcher Affiliation | Academia | Yi Hao 1 Ayush Jain 1 Alon Orlitsky 1 Vaishakh Ravindrakumar 1 1Electrical and Computer Engineering, University of California, San Diego. Correspondence to: Vaishakh Ravindrakumar <varavind@ucsd.edu>. |
| Pseudocode | Yes | Algorithm 1 TURF. Input: Xⁿ, t, d, α. Set k ← 8c₁(d+1)/α (c₁ is the constant in Lemma 9); β ← 1 + 4k/(α(d+1)); η_d ← √((d+1)/n). Then (I_ADLS, f^adls_I), I ∈ I_ADLS ← ADLS(Xⁿ, t, d, β, η_d). Output: f^out_{t,d,α} ← f^adj_{I, f^adls_I, d, k}, I ∈ I_ADLS. |
| Open Source Code | No | The paper does not contain any statement about making its own source code publicly available or provide a link to a code repository. It mentions using 'the code provided in (Acharya et al., 2017)' for ADLS, which is a third-party tool. |
| Open Datasets | No | The paper describes the generation of synthetic data (e.g., mixtures of Beta, Gamma, and Gaussian distributions, and perturbed versions), but does not use or provide access to any specific publicly available datasets. For instance, it describes 'mixtures of Beta: .4B(.8, 4)+.6B(2, 2)' and how they perturb these distributions. |
| Dataset Splits | No | The paper discusses 'cross-validation techniques' for parameter selection, but it does not specify how the data is split into training, validation, or test sets, either as percentages or as sample counts. The experiments instead draw fresh samples from the described distributions. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions) that would be needed to replicate the experiments. It only mentions using 'the code provided in (Acharya et al., 2017)' for ADLS. |
| Experiment Setup | Yes | Algorithm 1 TURF provides parameters (t, d, α, k, β, ηd) used in the algorithm. Section 7, 'Experiments', states: 'run for n between 1,000 and 80,000, and averaged over 50 runs.' It also specifies parameters for synthetic data generation: 'We choose k = 100 and c2 = 0.05, 1, 0.1 for the Beta, Gamma and Gaussian mixtures respectively, to yield the distributions shown in Figure 1.' It also notes 'd=1' and 'd=2' for figures. |
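The parameter assignments quoted from the Algorithm 1 pseudocode above (k, β, η_d as functions of n, d, and α) can be sketched directly. This is a minimal illustration of those three formulas only, not an implementation of TURF; the default value of c₁, the constant from Lemma 9, is a placeholder assumption since the paper's table excerpt does not state it.

```python
import math

def turf_parameters(n, d, alpha, c1=1.0):
    """Compute the parameter settings from Algorithm 1 (TURF).

    c1 is the constant from the paper's Lemma 9; its default here
    is a placeholder assumption, not a value given in the paper.
    """
    k = 8 * c1 * (d + 1) / alpha           # k <- 8*c1*(d+1)/alpha
    beta = 1 + 4 * k / (alpha * (d + 1))   # beta <- 1 + 4k/(alpha*(d+1))
    eta_d = math.sqrt((d + 1) / n)         # eta_d <- sqrt((d+1)/n)
    return k, beta, eta_d

# Example at the smallest sample size used in the experiments (n = 1,000),
# with degree d = 1 and an illustrative approximation factor alpha = 0.5.
k, beta, eta_d = turf_parameters(n=1000, d=1, alpha=0.5)
```

With c₁ = 1 this yields k = 32, β = 129, and η_d = √(2/1000) ≈ 0.0447; in the paper, these values feed the ADLS subroutine and the final adjusted estimate.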