Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Correcting Predictions for Approximate Bayesian Inference
Authors: Tomasz Kuśmierczyk, Joseph Sakaya, Arto Klami | Pages: 4511-4518
AAAI 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the approach empirically in several problems, confirming its potential. ... We conduct a series of experiments, using variational approximation as q(θ). We first compare the method against the alternative of calibrating the posterior inference to account for the loss for a matrix factorization model, and then demonstrate improved decisions for a sparse regression model and a multilevel model for cases with approximations. |
| Researcher Affiliation | Academia | Tomasz Kuśmierczyk, Joseph Sakaya, Arto Klami; Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki; {tomasz.kusmierczyk, joseph.sakaya, arto.klami}@helsinki.fi |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | The code to reproduce our experiments is available online. (Footnote 1: https://github.com/tkusmierczyk/correcting_approximate_bayesian_predictions) |
| Open Datasets | Yes | on a subset of last.fm data (Bertin-Mahieux et al. 2011). ... We use the radon data and the multi-level model... (Gelman and Hill 2006). ... We apply the model on the corn data (Chen and Martin 2009). |
| Dataset Splits | Yes | we split the data randomly into equally sized training and test set. ... randomly split into equally sized training and test subsets. |
| Hardware Specification | No | The paper mentions 'computational resources' in the acknowledgements but does not specify any particular hardware components like CPU or GPU models, or memory details. |
| Software Dependencies | No | The paper mentions 'probabilistic programming tools, such as Stan (Carpenter et al. 2017) and Edward (Tran et al. 2016)' and that a model is 'implemented using the publicly available Stan code', but it does not provide specific version numbers for these or other software dependencies used in the experiments. |
| Experiment Setup | Yes | In our experiments, we use a simple feed-forward network with 3 hidden layers (with 20, 20 and 10 nodes) with ReLU activation and Adam optimizer with learning rate = 0.01 ... we use L = 5 in our empirical experiments. ... we use S = 1000 in other experiments. ... we use B = 20 quantiles. |
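
For readers who want to reuse the reported settings, the values quoted in the Dataset Splits and Experiment Setup rows can be collected in one place. The following is a minimal sketch under stated assumptions, not the authors' code: the names `ExperimentSetup`, `split_half`, and `count_parameters` are hypothetical, the random seed and the input/output sizes are placeholders, and the roles of `L`, `S`, and `B` are left as quoted because the table does not define them.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Hyperparameters as quoted in the table (class name is hypothetical)."""
    hidden_layers: tuple = (20, 20, 10)  # "3 hidden layers (with 20, 20 and 10 nodes)"
    activation: str = "relu"             # "ReLU activation"
    optimizer: str = "adam"              # "Adam optimizer"
    learning_rate: float = 0.01          # "learning rate = 0.01"
    L: int = 5                           # "we use L = 5 in our empirical experiments"
    S: int = 1000                        # "we use S = 1000 in other experiments"
    B: int = 20                          # "we use B = 20 quantiles"


def split_half(items, seed=0):
    """Randomly split items into two halves of equal size.

    The seed is an assumption; the table does not report one.
    With an odd number of items, the extra item goes to the test half.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def count_parameters(n_inputs, hidden, n_outputs):
    """Count weights and biases of a fully connected feed-forward network."""
    sizes = (n_inputs, *hidden, n_outputs)
    return sum((fan_in + 1) * fan_out for fan_in, fan_out in zip(sizes, sizes[1:]))


setup = ExperimentSetup()
train, test = split_half(range(10))
# Input and output sizes of 1 are placeholders, not values from the paper.
n_params = count_parameters(1, setup.hidden_layers, 1)
```

Keeping the settings in a frozen dataclass makes it harder to change a value by accident between runs; the actual network and training loop would still have to come from the authors' repository.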