Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Streaming, Distributed Variational Inference for Bayesian Nonparametrics
Authors: Trevor Campbell, Julian Straub, John W. Fisher III, Jonathan P. How
NeurIPS 2015 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, the proposed inference framework is evaluated on the DP Gaussian mixture with a normal-inverse-Wishart (NIW) prior. We compare the streaming, distributed procedure coupled with standard variational inference [24] (SDA-DP) to five state-of-the-art inference algorithms: memoized online variational inference (moVB) [13], stochastic online variational inference (SVI) [9] with learning rate $(t+10)^{-1/2}$, sequential variational approximation (SVA) [7] with cluster creation threshold $10^{-1}$ and prune/merge threshold $10^{-3}$, subcluster splits MCMC (SC) [14], and batch variational inference (Batch) [24]. ... Figure 3 shows the results from the experiment over 30 trials... |
| Researcher Affiliation | Academia | Trevor Campbell¹, Julian Straub², John W. Fisher III², Jonathan P. How¹; ¹LIDS, ²CSAIL, MIT; {tdjc@, jstraub@csail., fisher@csail., jhow@}mit.edu |
| Pseudocode | No | The paper describes the algorithm steps in narrative and with diagrams (Figure 1), but does not provide a formally structured pseudocode block or a section explicitly labeled 'Algorithm'. |
| Open Source Code | No | The paper states 'For the experiments in this work, we used the implementation at github.com/hrldcpr/hungarian.' which refers to a third-party library used, not the open-sourcing of the authors' own methodology. |
| Open Datasets | Yes | MNIST Digits [25]: This dataset consisted of 70,000 28 × 28 images of hand-written digits, with 10,000 held out for testing. ... SUN Images [26]: This dataset consisted of 108,755 images from 397 scene categories, with 8,755 held out for testing. ... [25] Yann LeCun, Corinna Cortes, and Christopher J.C. Burges. MNIST database of handwritten digits. Online: yann.lecun.com/exdb/mnist. [26] Jianxiong Xiao, James Hays, Krista A. Ehinger, Aude Oliva, and Antonio Torralba. SUN 397 image database. Online: vision.cs.princeton.edu/projects/2010/SUN. |
| Dataset Splits | No | The paper specifies held-out test sets for MNIST (10,000), SUN (8,755), and Airplane Trajectories (1,000), but it does not explicitly state details for a separate validation set or its split percentage. |
| Hardware Specification | No | The paper mentions 'All experiments were performed on a computer with 24 CPU cores and 12GiB of RAM,' which describes the general hardware specifications but lacks specific details like CPU model, GPU model, or clock speeds. |
| Software Dependencies | No | The paper mentions using 'github.com/hrldcpr/hungarian' but does not provide specific version numbers for this or any other software dependencies like programming languages or libraries. |
| Experiment Setup | Yes | SDA-DP minibatch inference was truncated to K = 50 components, and all other algorithms were truncated to K = 200 components. ... Stochastic variational inference (SVI) [9] with learning rate $(t+10)^{-1/2}$, sequential variational approximation (SVA) [7] with cluster creation threshold $10^{-1}$ and prune/merge threshold $10^{-3}$ ... minibatches of size 50. ... Data was split into minibatches of size 100... Data was split into minibatches of size 500. |
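
The setup values quoted in the table can be made concrete with a small sketch. It is illustrative only and is not the authors' code: the function names `svi_learning_rate`, `minibatches`, and `train_count` are hypothetical, and the learning rate is assumed to be the decaying schedule (t+10)^(-1/2). The table does not say which dataset used which minibatch size, so the example deliberately uses a toy sequence rather than pairing sizes with datasets.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def svi_learning_rate(t: int, delay: float = 10.0, exponent: float = 0.5) -> float:
    """Step size (t + delay)^(-exponent) for update t (assumed decaying schedule)."""
    return (t + delay) ** (-exponent)


def minibatches(data: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` items; the last may be shorter."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


def train_count(total: int, held_out: int) -> int:
    """Number of items left for training after holding out a test set."""
    return total - held_out


if __name__ == "__main__":
    # Training-set sizes implied by the held-out counts quoted in the table.
    print("MNIST train:", train_count(70_000, 10_000))
    print("SUN train:", train_count(108_755, 8_755))

    # Toy data (hypothetical): 1,050 items in minibatches of 500.
    toy = list(range(1_050))
    print("batch sizes:", [len(b) for b in minibatches(toy, 500)])

    # The step size shrinks as the update index t grows.
    print("rates:", [round(svi_learning_rate(t), 4) for t in (0, 6, 90)])
```

Keeping the delay and exponent as parameters makes it easy to try other schedules; the defaults simply mirror the values quoted in the Experiment Setup row.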