reproducibilityindex.ai

Distributed Machine Learning with Sparse Heterogeneous Data

Authors: Dominic Richards, Sahand Negahban, Patrick Rebeschini

NeurIPS 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Statistical Performance: Figure 1 plots the probability of recovery against the number of samples held by non-root nodes, N_v for v ∈ V \ {1}, with a fixed number of root agent samples N_1 = 2s log(ed/s). Observe, for a path topology and balanced tree topology, once the non-root nodes have beyond approximately 30 samples, the solution to TVBP finds the correct support for all graph sizes. In contrast, recovering a signal with Basis Pursuit at the same level of sparsity and dimension would require at least 80 samples, i.e. 2s log(ed/s). We therefore save approximately 50 samples for each non-root problem.
Researcher Affiliation	Academia	Dominic Richards, Department of Statistics, University of Oxford, 24-29 St Giles, Oxford, OX1 3LB, Dominic.Richards94@gmail.com; Sahand N Negahban, Department of Statistics and Data Science, Yale University, 24 Hillhouse Ave., New Haven, CT 06510, Sahand.Negahban@Yale.edu; Patrick Rebeschini, Department of Statistics, University of Oxford, 24-29 St Giles, Oxford, OX1 3LB, Patrick.Rebeschini@stats.ox.ac.uk
Pseudocode	No	The paper describes algorithm steps in text within Appendix A.2 'ADMM for TVBP' but does not present them in a formally structured pseudocode or algorithm block.
Open Source Code	No	The paper does not provide an explicit statement about releasing its source code for the described methodology or a link to a code repository.
Open Datasets	Yes	Hyperspectral Unmixing. We apply Total Variation Basis Pursuit Denoising to the popular AVIRIS Cuprite mine reflectance dataset (https://aviris.jpl.nasa.gov/data/free_data.html) with a subset of the USGS library splib07 [26].
Dataset Splits	No	The paper describes the number of samples N_v for agents and the problem parameters (d, s, s′) but does not provide explicit train/test/validation dataset split percentages, counts, or a specific splitting methodology.
Hardware Specification	No	The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used to run its experiments.
Software Dependencies	No	The paper mentions software packages like 'SPGL1 Python package', 'CVXOPT', and 'SUnSAL' but does not provide specific version numbers for these or other ancillary software components.
Experiment Setup	Yes	Statistical Performance: Figure 1 plots the probability of recovery against the number of agent samples N_v for v ∈ V \ {1} with a fixed number of root agent samples N_1 = 2s log(ed/s). Problem setting: d = 128, s = 12, s′ = 4 and N_1 = 2s log(ed/s) = 80, for path (Left) and balanced tree with branches of size 2 (Right). Lines indicate graph sizes with n ∈ {2, 4, 8, 16} for path and n ∈ {7, 15, 31} for balanced tree with heights of {2, 3, 4} respectively. Solution to reformulated problem (11) found using CVXOPT. Each point is an average of 20 replications. Signal sampled from {+1, −1}, differences concatenation of s′ values. {A_v}_{v ∈ V} standard Gaussian and G̃ = G.
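
The figures quoted in the Experiment Setup row can be sanity-checked with a short script. This sketch is not from the paper: the function names are hypothetical, rounding 2s log(ed/s) down to an integer is an assumption chosen to match the reported value of 80, and the tree-size formula assumes a full tree in which every internal node has exactly 2 children.

```python
import math


def root_sample_count(d: int, s: int) -> int:
    """Evaluate 2 s log(e d / s) with the natural log, rounded down (assumed rounding)."""
    return math.floor(2 * s * math.log(math.e * d / s))


def balanced_tree_size(height: int, branching: int = 2) -> int:
    """Count the nodes of a full tree: branching**level nodes at each level 0..height."""
    return sum(branching ** level for level in range(height + 1))


d, s = 128, 12                      # dimension and root sparsity from the table
n_root = root_sample_count(d, s)    # samples at the root agent
approx_non_root = 30                # approximate threshold the table reads off Figure 1

print(n_root)                                       # 80
print(n_root - approx_non_root)                     # 50
print([balanced_tree_size(h) for h in (2, 3, 4)])   # [7, 15, 31]
```

Under these assumptions the script reproduces N_1 = 80, the saving of roughly 50 samples per non-root node, and the balanced-tree sizes n ∈ {7, 15, 31} for heights {2, 3, 4}.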