Distribution Compression in Near-Linear Time

Authors: Abhishek Shetty, Raaz Dwivedi, Lester Mackey

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our benchmarks with high-dimensional Monte Carlo samples and Markov chains targeting challenging differential equation posteriors, Compress++ matches or nearly matches the accuracy of its input algorithm in orders of magnitude less time.
Researcher Affiliation | Collaboration | 1. Department of EECS, UC Berkeley; 2. Department of Computer Science, Harvard University and Department of EECS, MIT; 3. Microsoft Research New England
Pseudocode | Yes | Algorithm 1: COMPRESS (a sketch of the recursion follows the table)
Open Source Code | Yes | See the goodpoints Python package for Python implementations of all methods in this paper and https://github.com/microsoft/goodpoints for code reproducing each experiment.
Open Datasets | Yes | For the i.i.d. targets, we report MMD_k(P, P_out), which can be exactly computed in closed form. For the MCMC targets, we report the thinning error MMD_k(P_in, P_out) analyzed directly by our theory (Thms. 2 and 4). ... we adopt the four posterior targets of Riabiz et al. (2020a) based on the Goodwin (1965) model of oscillatory enzymatic control (d = 4), the Lotka (1925); Volterra (1926) model of oscillatory predator-prey evolution (d = 4), the Hinch et al. (2004) model of calcium signalling in cardiac cells (d = 38), and a tempered Hinch model posterior (d = 38). (A sketch of the MMD computation follows the table.)
Dataset Splits | No | The paper describes compressing input point sequences and evaluating the MMD error and runtime of the resulting coresets. It does not mention or specify traditional train/validation/test dataset splits, as its focus is on distribution compression rather than supervised model training.
Hardware Specification | Yes | All runtimes were measured on a single core of an Intel Xeon CPU.
Software Dependencies | No | The paper mentions the 'goodpoints Python package' and 'Python implementations' but does not specify exact version numbers for Python or any other software libraries.
Experiment Setup | Yes | For COMPRESS++, we use g = 4 throughout to satisfy the small relative error criterion (11) in all experiments. ... In all experiments involving kernel thinning, we set the algorithm failure probability parameter δ = 1/2 ... Throughout we use a Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2σ^2)) with σ^2 as specified by Dwivedi & Mackey (2021, Sec. K.2) for the MCMC targets and σ^2 = 2d otherwise. (These settings are combined in the Compress++ sketch below.)
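
For reference, Algorithm 1 (COMPRESS) can be summarized in a few lines: split the input into four blocks, recursively compress each, concatenate the results, and halve. Below is a minimal Python sketch of that recursion; `random_halve` is a placeholder stand-in for the paper's kernel-thinning halving step, not the goodpoints implementation, and the function names are ours.

```python
import numpy as np

def compress(points, halve, g=4):
    """Sketch of Algorithm 1 (COMPRESS) with oversampling parameter g.

    Assumes n = len(points) is a power of 4 with n >= 4**g; returns a
    coreset of 2**g * sqrt(n) points.
    """
    n = len(points)
    if n == 4 ** g:
        return points  # base case: small enough to return unchanged
    quarter = n // 4
    # Each recursive call on n/4 points returns 2**(g-1) * sqrt(n) points.
    blocks = [compress(points[i * quarter:(i + 1) * quarter], halve, g)
              for i in range(4)]
    combined = np.concatenate(blocks)  # 2**(g+1) * sqrt(n) points in total
    return halve(combined)             # halved back down to 2**g * sqrt(n)

def random_halve(points, rng=np.random.default_rng(0)):
    """Placeholder halver: uniform subsampling instead of kernel thinning."""
    idx = rng.choice(len(points), size=len(points) // 2, replace=False)
    return points[np.sort(idx)]

X = np.random.default_rng(1).normal(size=(4 ** 6, 2))  # n = 4096 input points
print(compress(X, random_halve).shape)  # (1024, 2): 2**4 * sqrt(4096) points
```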
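
The reported error metric has a simple closed form for empirical distributions: MMD_k(P_in, P_out)^2 equals the mean of k over input pairs, minus twice the mean over input-output pairs, plus the mean over output pairs. The NumPy sketch below computes this with the paper's Gaussian kernel; the function names are illustrative, not the goodpoints API.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma2):
    """Gaussian kernel matrix k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dists = (np.sum(X ** 2, axis=1)[:, None]
                + np.sum(Y ** 2, axis=1)[None, :] - 2.0 * X @ Y.T)
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma2))

def mmd(X_in, X_out, sigma2):
    """Empirical MMD_k between the empirical distributions of two point sets."""
    mmd2 = (gaussian_kernel(X_in, X_in, sigma2).mean()
            - 2.0 * gaussian_kernel(X_in, X_out, sigma2).mean()
            + gaussian_kernel(X_out, X_out, sigma2).mean())
    return np.sqrt(max(mmd2, 0.0))  # clip tiny negatives from rounding

# Example: thinning error between an input sample and a uniform subsample.
rng = np.random.default_rng(0)
d = 2
X_in = rng.normal(size=(1024, d))
X_out = X_in[rng.choice(1024, size=32, replace=False)]
print(mmd(X_in, X_out, sigma2=2 * d))  # paper's default bandwidth sigma^2 = 2d
```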
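
Finally, the reported settings (g = 4, δ = 1/2, Gaussian kernel with σ^2 = 2d) slot into the Compress++ pipeline roughly as follows. This sketch reuses the `compress`, `random_halve`, and `mmd` placeholders defined above; the actual procedure thins with kernel thinning at failure probability δ = 1/2 rather than by uniform subsampling.

```python
import numpy as np

# Reuses compress, random_halve, and mmd from the sketches above.

def compresspp(points, halve, thin, g=4):
    """Sketch of Compress++: run COMPRESS with oversampling g, then thin
    the 2**g * sqrt(n) intermediate coreset down to sqrt(n) points."""
    n = len(points)
    intermediate = compress(points, halve, g)
    return thin(intermediate, int(round(np.sqrt(n))))

def uniform_thin(points, m, rng=np.random.default_rng(2)):
    """Placeholder for kernel thinning with delta = 1/2: keep m random points."""
    idx = rng.choice(len(points), size=m, replace=False)
    return points[np.sort(idx)]

d = 2
X = np.random.default_rng(1).normal(size=(4 ** 6, d))
coreset = compresspp(X, random_halve, uniform_thin, g=4)  # 64 = sqrt(4096) points
print(mmd(X, coreset, sigma2=2 * d))  # thinning error, bandwidth sigma^2 = 2d
```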