Lossless Compression with Probabilistic Circuits
Authors: Anji Liu, Stephan Mandt, Guy Van den Broeck
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, our PC-based (de)compression algorithm runs 5-40 times faster than neural compression algorithms that achieve similar bitrates. By scaling up the traditional PC structure learning pipeline, we achieve state-of-the-art results on image datasets such as MNIST. |
| Researcher Affiliation | Academia | Anji Liu, CS Department, UCLA (liuanji@cs.ucla.edu); Stephan Mandt, CS Department, University of California, Irvine (mandt@uci.edu); Guy Van den Broeck, CS Department, UCLA (guyvdb@cs.ucla.edu) |
| Pseudocode | Yes | Algorithm 1 Compute F(x) (see Alg. 3 for details). (A sketch of this left-cumulative computation appears after the table.) |
| Open Source Code | Yes | Our open-source implementation of the PC-based (de)compression algorithm can be found at https://github.com/Juice-jl/PressedJuice.jl. |
| Open Datasets | Yes | Our experiments show that on MNIST and EMNIST, the PC-based compression algorithm achieved SoTA bitrates. On more complex data such as subsampled ImageNet, we hybridize PCs with normalizing flows and show that PCs can significantly improve the bitrates of the base normalizing flow models. |
| Dataset Splits | No | The paper mentions using datasets like MNIST but does not provide specific training/validation/test split percentages or sample counts, nor does it refer to predefined splits with citations for reproducibility beyond generic dataset names. |
| Hardware Specification | Yes | The compression (resp. decompression) times are the total computation time used to encode (resp. decode) all 10,000 MNIST test samples on a single TITAN RTX GPU. All experiments are performed on a server with 72 CPUs, 512 GB memory, and 2 TITAN RTX GPUs. In all experiments, we only use a single GPU on the server. |
| Software Dependencies | No | The paper mentions software like PyTorch, Juice.jl, and rANS, but does not provide specific version numbers for these software dependencies as used in their experiments. |
| Experiment Setup | Yes | For the PCs, we adopted EiNets (Peharz et al., 2020a) with hyperparameters K = 12 and R = 4. Instead of using random binary trees to define the model architecture, we used binary trees in which latent variables that are adjacent in z are placed close together in the tree. Parameter learning was performed in the following steps. First, compute the average log-likelihood over a mini-batch of samples; the negative average log-likelihood is the loss. Second, compute the gradients w.r.t. all model parameters by backpropagating the loss. Finally, update the IDF and the PCs individually using the gradients: for the IDF, following Hoogeboom et al. (2019), the Adamax optimizer was used; for the PCs, following Peharz et al. (2020a), the gradients were used to compute the EM targets of the parameters, followed by mini-batch EM updates. (A hedged sketch of this joint update appears after the table.) |
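
The quoted Algorithm 1 computes F(x), the left cumulative probability Pr(X < x) under the model's lexicographic variable ordering, which a streaming entropy coder such as rANS then consumes. Below is a minimal Python sketch of the standard decomposition F(x) = Σ_i Pr(X_<i = x_<i, X_i < x_i); the `prefix_lt_prob` callback and the independent-Bernoulli toy distribution are illustrative stand-ins, not the paper's PC marginal queries.

```python
import numpy as np

def left_cumulative(x, prefix_lt_prob):
    """F(x) = Pr(X < x) in lexicographic order, via the decomposition
    F(x) = sum_i Pr(X_1 = x_1, ..., X_{i-1} = x_{i-1}, X_i < x_i).

    `prefix_lt_prob(prefix, v)` is an illustrative marginal-query callback;
    in the paper this role is played by feedforward passes of the PC."""
    return sum(prefix_lt_prob(x[:i], x[i]) for i in range(len(x)))

# Toy stand-in distribution: independent Bernoulli pixels, p[i] = Pr(X_i = 1).
p = np.array([0.3, 0.6, 0.8])

def bernoulli_prefix_lt(prefix, v):
    # Pr(X_<i = prefix) * Pr(X_i < v); for binary values, X_i < v only when v = 1.
    prefix_prob = np.prod([p[j] if b == 1 else 1.0 - p[j] for j, b in enumerate(prefix)])
    lt_prob = (1.0 - p[len(prefix)]) if v == 1 else 0.0
    return prefix_prob * lt_prob

F = left_cumulative([1, 0, 1], bernoulli_prefix_lt)  # 0.724, fed to an rANS coder
```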
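
The joint parameter-learning loop described in the experiment-setup row (mini-batch negative log-likelihood, backpropagation, then an Adamax step for the IDF and a mini-batch EM step for the PC) could be organized roughly as in the following PyTorch-style sketch. The names `idf`, `pc`, `pc.log_prob`, and `pc.em_update` are hypothetical placeholders, not the paper's implementation or the Juice.jl/EiNet APIs.

```python
import torch

# Hypothetical placeholders: `idf` is an integer discrete flow mapping images x
# to discrete latents z (a bijection, so log p(x) = log p_PC(z)); `pc` is an
# EiNet-style probabilistic circuit over z with a mini-batch EM update routine.
optimizer = torch.optim.Adamax(idf.parameters(), lr=1e-3)

def train_step(batch):
    # 1. Loss: negative average log-likelihood over the mini-batch.
    z = idf(batch)                  # discrete latents (straight-through gradients assumed)
    loss = -pc.log_prob(z).mean()

    # 2. Backpropagate to obtain gradients for both the flow and the circuit.
    optimizer.zero_grad()
    loss.backward()

    # 3. Update each component with its own scheme: an Adamax step for the IDF,
    #    and a mini-batch EM step for the PC, where the accumulated gradients
    #    are turned into EM targets (expected sufficient statistics).
    optimizer.step()
    pc.em_update()
    return loss.item()
```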