General Table Completion using a Bayesian Nonparametric Model
Authors: Isabel Valera, Zoubin Ghahramani
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, our experiments over five real databases show that the proposed approach provides more robust and accurate estimates than the standard IBP and the Bayesian probabilistic matrix factorization with Gaussian observations. |
| Researcher Affiliation | Academia | Isabel Valera Department of Signal Processing and Communications University Carlos III in Madrid ivalera@tsc.uc3m.es Zoubin Ghahramani Department of Engineering University of Cambridge zoubin@eng.cam.ac.uk |
| Pseudocode | Yes | Algorithm 1 Inference Algorithm. |
| Open Source Code | Yes | An efficient C-code implementation for Matlab of the proposed table completion tool is also released on the authors website. |
| Open Datasets | Yes | Statlog German credit dataset [5]... Dataset available on: http://archive.ics.uci.edu/ml/datasets.html |
| Dataset Splits | No | The paper discusses average test log-likelihood per missing datum but does not provide specific details on train/validation/test dataset splits, such as percentages, sample counts, or cross-validation setup. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | An efficient C-code implementation for Matlab of the proposed table completion tool is also released on the authors website. |
| Experiment Setup | Yes | For the GIBP, we consider for the real positive and the count data the following transformation, that maps from the real numbers to the real positive numbers, f(x) = log(exp(wx) + 1), where w is a user hyper-parameter. For the BPMF model, we have used different numbers of latent features (in particular, 10, 20 and 50), although we only show the best results for each database, specifically, K = 10 for the NESARC and the wine databases, and K = 50 for the remainder. |
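
The transformation quoted in the Experiment Setup row, f(x) = log(exp(wx) + 1), overflows if evaluated naively for large wx. The following is a minimal Python sketch of a numerically stable evaluation, not the authors' released C/Matlab code. The function names `softplus_transform` and `inverse_softplus_transform` are hypothetical, and the inverse is not stated in the quoted text; it is obtained here by solving y = f(x) for x.

```python
import math


def softplus_transform(x: float, w: float) -> float:
    """Evaluate f(x) = log(exp(w * x) + 1), mapping reals to positive reals."""
    z = w * x
    # Rewrite log(1 + exp(z)) as max(z, 0) + log(1 + exp(-|z|)) so that
    # exp() never receives a large positive argument.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def inverse_softplus_transform(y: float, w: float) -> float:
    """Solve y = log(exp(w * x) + 1) for x (assumption: y > 0 and w != 0)."""
    if y <= 0.0:
        raise ValueError("y must be strictly positive")
    # log(exp(y) - 1) = y + log(1 - exp(-y)); expm1 keeps precision for small y.
    return (y + math.log(-math.expm1(-y))) / w
```

For any w, f(0) = log 2, and for large positive wx the output approaches wx, so the hyper-parameter w controls how quickly the map becomes approximately linear.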