HyperSPNs: Compact and Expressive Probabilistic Circuits

Authors: Andy Shih, Dorsa Sadigh, Stefano Ermon

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We experiment with HyperSPN on three sets of density estimation tasks: the Twenty Datasets benchmark, the Amazon Baby Registries benchmark, and the Street View House Numbers (SVHN) dataset. All runs were done using a single GPU. We primarily compare with weight decay as the competing regularization method. We also include comparisons with smaller SPNs that have the same degrees of freedom as our HyperSPN, and with results reported from other works in the literature. (See the parameter-count sketch after the table for an illustration of the matched degrees of freedom.)
Researcher Affiliation | Academia | Andy Shih (Stanford University, andyshih@cs.stanford.edu); Dorsa Sadigh (Stanford University, dorsa@cs.stanford.edu); Stefano Ermon (Stanford University, ermon@cs.stanford.edu)
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide any information about open-source code availability.
Open Datasets | Yes | We verify the empirical performance of HyperSPNs on density estimation tasks for the Twenty Datasets benchmark [12], the Amazon Baby Registries benchmark [8], and the Street View House Numbers (SVHN) [20] dataset.
Dataset Splits | No | We plot the training curves on the training and validation data for the Plants and Pumsb-star datasets.
Hardware Specification | No | All runs were done using a single GPU.
Software Dependencies | No | The paper mentions the use of “Adam” as an optimizer, but does not provide specific software names with version numbers for any libraries or dependencies used.
Experiment Setup | Yes | We use the RAT-SPN structure described in Section 2.1, choosing layer size parameter k = 5 and replicas r = 50, randomizing the variable orders for each replica. ... For Weight Decay, we vary the weight decay value between 1e-3, 1e-4, and 1e-5. ... The HyperSPN uses the exact same underlying SPN structure, along with an external neural network (a 2-layer MLP of width 20) and embeddings of dimension ranging between h = 5, 10, 20. (A configuration sketch of this setup follows the table.)
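
The Experiment Setup row describes the model only in prose, so a minimal configuration sketch is given below. The paper does not release code or name its framework; the PyTorch usage, the class and variable names, the weight-chunk sizes, the initialization, and the learning rate are all illustrative assumptions, not the authors' implementation. Only the 2-layer MLP of width 20, the embedding dimension h, the Adam optimizer, and the weight-decay grid 1e-3/1e-4/1e-5 for the baseline come from the quoted setup.

```python
# Hedged sketch of a HyperSPN-style weight generator, assuming PyTorch.
# Names, shapes, init scale, and learning rate are assumptions for illustration.
import torch
import torch.nn as nn

class HyperWeights(nn.Module):
    """Generates chunks of SPN sum-node weights from small embeddings via a shared MLP."""
    def __init__(self, num_chunks: int, chunk_size: int, h: int = 10, width: int = 20):
        super().__init__()
        # One embedding of dimension h per chunk of SPN parameters (assumed grouping).
        self.embeddings = nn.Parameter(torch.randn(num_chunks, h) * 0.1)
        # 2-layer MLP of width 20, as stated in the paper's setup.
        self.mlp = nn.Sequential(
            nn.Linear(h, width), nn.Tanh(),
            nn.Linear(width, chunk_size),
        )

    def forward(self) -> torch.Tensor:
        # Regenerate the SPN weights from the embeddings on every forward pass.
        return self.mlp(self.embeddings)  # shape: (num_chunks, chunk_size)

# Hypothetical sizes: an SPN whose sum nodes need 50 chunks of 125 weights each.
hyper = HyperWeights(num_chunks=50, chunk_size=125, h=10)
weights = hyper()

# Adam is the optimizer named in the paper; the learning rate here is assumed.
optimizer = torch.optim.Adam(hyper.parameters(), lr=1e-3)

# Weight-decay baseline: a plain SPN keeps its weights explicitly and is
# regularized with one of the grid values quoted from the paper.
plain_weights = nn.Parameter(torch.randn(50, 125) * 0.1)
for wd in (1e-3, 1e-4, 1e-5):
    baseline_opt = torch.optim.Adam([plain_weights], lr=1e-3, weight_decay=wd)
```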
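
The Research Type row also mentions baselines of smaller SPNs with the same degrees of freedom as the HyperSPN. The back-of-the-envelope comparison below shows what "same degrees of freedom" means in terms of parameter counts; all concrete numbers (50 chunks of 125 weights, h = 10) are illustrative assumptions, not figures from the paper.

```python
# Illustrative parameter counting only; none of these numbers come from the paper.
def hyperspn_param_count(num_chunks: int, chunk_size: int, h: int, width: int = 20) -> int:
    """Embeddings plus a shared 2-layer MLP (h -> width -> chunk_size), including biases."""
    embeddings = num_chunks * h
    mlp = (h * width + width) + (width * chunk_size + chunk_size)
    return embeddings + mlp

def plain_spn_param_count(num_chunks: int, chunk_size: int) -> int:
    """An ordinary SPN stores every mixture weight explicitly."""
    return num_chunks * chunk_size

print(hyperspn_param_count(num_chunks=50, chunk_size=125, h=10))  # 3345 trainable values
print(plain_spn_param_count(num_chunks=50, chunk_size=125))       # 6250 trainable values
# A "same degrees of freedom" baseline would be a plain SPN shrunk until its
# explicit weight count matches the HyperSPN's total above.
```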