Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Sum of Squares Circuits

Authors: Lorenzo Loconte, Stefan Mengel, Antonio Vergari

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we empirically show the effectiveness of sum of squares circuits in performing distribution estimation. ... Finally, we empirically validate the increased expressiveness of sum of squares circuits for distribution estimation, showing they can scale to real-world data when tensorized (Section 7). ... We evaluate structured monotonic (+_sd), squared PCs (±²_ℝ, ±²_ℂ), their sums, and µSOCS as the product of a monotonic and a SOCS PC (+_sd Σ²_cmp, see Definition 5) on distribution estimation tasks using both continuous and discrete real-world data.
Researcher Affiliation | Academia | Lorenzo Loconte (1), Stefan Mengel (2), Antonio Vergari (1); (1) School of Informatics, University of Edinburgh, UK; (2) University of Artois, CNRS, Centre de Recherche en Informatique de Lens (CRIL), France
Pseudocode | No | The paper describes algorithms such as Multiply conceptually (e.g., "Multiplying two compatible circuits c1, c2 can be done via the Multiply algorithm in time O(|c1||c2|) as described in Vergari et al. (2021) and which we report in Appendix A.1"), but it does not provide structured pseudocode or algorithm blocks in the main text.
Open Source Code | Yes | Code: https://github.com/april-tools/sos-npcs
Open Datasets | Yes | We estimate the distribution of four continuous UCI data sets: Power, Gas, Hepmass, MiniBooNE, using the same preprocessing by Papamakarios, Pavlakou, and Murray (2017) (Table C.1). ... We estimate the probability distribution of MNIST, Fashion MNIST and CelebA images (Table C.2).
Dataset Splits | Yes | For all UCI data sets, we preprocess them as in Papamakarios, Pavlakou, and Murray (2017), which includes standard z-normalization and random splits for training, validation, and test sets. Specifically, we use an 80/10/10 split, respectively. We use the official splits for MNIST, Fashion MNIST, and CelebA.
Hardware Specification | No | The paper mentions that models were trained and experiments were performed, but it does not specify any hardware components such as GPU models, CPU models, or memory.
Software Dependencies | No | The paper does not name specific software with version numbers (e.g., Python, PyTorch, or CUDA versions).
Experiment Setup | Yes | Given a training set D = {x^(i)}_{i=1}^N on variables X, we are interested in estimating p(X) from D by minimizing the parameters' negative log-likelihood on a batch B ⊆ D, i.e., L := |B| log Z − Σ_{x∈B} log c(x), via gradient descent. ... For all UCI data sets, we train our models for 500 epochs using the Adam optimizer with a learning rate of 1e-3 and a batch size of 128.
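The Pseudocode row notes that the Multiply algorithm for compatible circuits runs in time O(|c1||c2|) but is only described conceptually. The following is a minimal sketch of such a structured product, assuming toy tree-shaped circuits over discrete variables with a shared decomposition; the node classes and function names are illustrative, not the authors' implementation (see Vergari et al. 2021 and the paper's Appendix A.1 for the actual algorithm).

```python
import math
from dataclasses import dataclass

@dataclass
class Leaf:
    var: str     # the single variable this input unit depends on
    table: dict  # value -> f(value), a toy discrete input function

@dataclass
class Sum:
    weights: list
    children: list

@dataclass
class Prod:
    children: list  # children over disjoint variable sets

def multiply(c1, c2):
    """Product circuit of two compatible circuits (same decomposition)."""
    if isinstance(c1, Leaf) and isinstance(c2, Leaf):
        assert c1.var == c2.var, "compatible leaves share their variable"
        return Leaf(c1.var, {v: c1.table[v] * c2.table[v] for v in c1.table})
    if isinstance(c1, Sum) and isinstance(c2, Sum):
        # pairwise products of children: the source of the O(|c1||c2|) bound
        weights, children = [], []
        for w1, a in zip(c1.weights, c1.children):
            for w2, b in zip(c2.weights, c2.children):
                weights.append(w1 * w2)
                children.append(multiply(a, b))
        return Sum(weights, children)
    if isinstance(c1, Prod) and isinstance(c2, Prod):
        # compatibility: both product nodes split the variables identically
        return Prod([multiply(a, b) for a, b in zip(c1.children, c2.children)])
    raise ValueError("circuits are not compatible")

def evaluate(c, x):
    if isinstance(c, Leaf):
        return c.table[x[c.var]]
    if isinstance(c, Sum):
        return sum(w * evaluate(ch, x) for w, ch in zip(c.weights, c.children))
    return math.prod(evaluate(ch, x) for ch in c.children)
```

In this toy setting, squaring a circuit with real parameters is just multiply(c, c), which is how the squared PCs evaluated in the paper can be represented as circuits.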
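To make the objective in the Experiment Setup row concrete, here is a toy illustration of minimizing L = |B| log Z − Σ_{x∈B} log c(x) for an unnormalized model c(x) = exp(θ_x) over a small discrete domain. Plain gradient descent stands in for the Adam optimizer the paper uses; the model, step size, and function names are all assumptions for illustration, not the authors' training code.

```python
import math

def nll(theta, batch):
    """L = |B| log Z - sum_x log c(x), with c(x) = exp(theta_x)."""
    log_z = math.log(sum(math.exp(t) for t in theta))
    return len(batch) * log_z - sum(theta[x] for x in batch)

def grad(theta, batch):
    z = sum(math.exp(t) for t in theta)
    counts = [sum(1 for x in batch if x == k) for k in range(len(theta))]
    # d L / d theta_k = |B| * softmax(theta)_k - #occurrences of k in B
    return [len(batch) * math.exp(theta[k]) / z - counts[k]
            for k in range(len(theta))]

def train(batch, k=3, lr=0.1, steps=200):
    theta = [0.0] * k
    for _ in range(steps):
        g = grad(theta, batch)
        theta = [t - lr * gi for t, gi in zip(theta, g)]
    return theta
```

The same structure carries over to circuits: log c(x) comes from a feedforward evaluation of the circuit and log Z from its tractable marginalization, with gradients handled by automatic differentiation.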