Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Multimodal Poisson Gamma Belief Network
Authors: Chaojie Wang, Bo Chen, Mingyuan Zhou
AAAI 2018 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results on bi-modal data consisting of images and tags show that the mPGBN can easily impute a missing modality and hence is useful for both image annotation and retrieval. We further demonstrate that the mPGBN achieves state-of-the-art results on unsupervisedly extracting latent features from multimodal data. |
| Researcher Affiliation | Academia | Chaojie Wang, Bo Chen, National Laboratory of Radar Signal Processing, Collaborative Innovation Center of Information Sensing & Understanding, Xidian University, Xi'an, Shaanxi, China; Mingyuan Zhou, McCombs School of Business, University of Texas at Austin, Austin, TX 78712, USA |
| Pseudocode | No | The paper provides mathematical formulations of the model but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions 'Publicly available code (Vedaldi and Fulkerson 2010; Bastan et al. 2010) could be used to extract these features' which refers to third-party code. It does not provide any link or explicit statement about releasing the source code for the proposed mPGBN model. |
| Open Datasets | Yes | We use in our experiments the MIR-Flicker data set (Huiskes and Lew 2008), which consists of 1 million images along with their user assigned tags that are retrieved from the social photography website Flicker. |
| Dataset Splits | No | The paper specifies training and testing splits: '15k image-text pairs are used for training and the remaining 10k pairs for testing'. However, it does not explicitly state a distinct 'validation' dataset split for hyperparameter tuning or early stopping, though some internal evaluation (e.g., 'infer a set of networks') might implicitly serve that role without being explicitly named as a validation set. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware used to run the experiments, such as GPU models, CPU types, or memory. |
| Software Dependencies | No | The paper mentions 'Publicly available code (Vedaldi and Fulkerson 2010; Bastan et al. 2010)' for feature extraction, but does not provide specific version numbers for these or any other software dependencies crucial for replication (e.g., programming languages, libraries, frameworks). |
| Experiment Setup | Yes | For hyper-parameters, we set ηt = 0.05 for all t, a0 = b0 = 0.01, and e0 = f0 = 1. We use 15k image-text pairs randomly selected from MIR-Flicker 25k to infer a set of networks with T ∈ {1, 2, 3, 4, 5} and Kt ∈ {50, 100, 200, 400, 800}, and apply the upward-downward Gibbs sampler to collect 200 MCMC samples after 200 burn-in to estimate the posterior mean of the latent representation of each test data sample. We choose a two-hidden-layer mPGBN, with 1024 hidden units in both hidden layers. We use 1000 Gibbs sampling iterations to train the mPGBN on the 15k training image-text pairs, and retain the inferred network (global variables) of the last sample. For each test image-text pair, we collect 500 MCMC samples after 500 burn-in iterations to infer its latent representation (local variables) under the network retained after training. |
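
The settings quoted in the Experiment Setup row can be gathered into a small configuration sketch. The following Python is illustrative only and is not the authors' code: the class and function names, the random seed, and the plain-list sample format are all assumptions, and the upward-downward Gibbs sampler itself is not implemented. It records the reported hyperparameters, reproduces the 15k/10k split sizes, and shows how a posterior mean would be averaged from samples kept after burn-in.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class SharedHyperparameters:
    """Values quoted in the table row; field names are hypothetical."""
    eta: float = 0.05  # eta_t, the same for every layer t
    a0: float = 0.01
    b0: float = 0.01
    e0: float = 1.0
    f0: float = 1.0


@dataclass(frozen=True)
class NetworkSweep:
    """First reported setup: a grid over depth T and layer width K_t."""
    depths: Tuple[int, ...] = (1, 2, 3, 4, 5)
    widths: Tuple[int, ...] = (50, 100, 200, 400, 800)
    burn_in: int = 200
    collected: int = 200


@dataclass(frozen=True)
class TwoLayerSetup:
    """Second reported setup: two hidden layers of 1024 units each."""
    hidden_units: Tuple[int, ...] = (1024, 1024)
    train_iterations: int = 1000
    test_burn_in: int = 500
    test_collected: int = 500


def split_indices(n_total: int = 25_000, n_train: int = 15_000,
                  seed: int = 0) -> Tuple[List[int], List[int]]:
    """Randomly split indices into train/test. The seed is an assumption;
    the quoted text does not report one."""
    rng = random.Random(seed)
    indices = list(range(n_total))
    rng.shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


def posterior_mean(samples: Sequence[Sequence[float]],
                   burn_in: int, collected: int) -> List[float]:
    """Average the `collected` samples that follow the first `burn_in` ones."""
    kept = samples[burn_in:burn_in + collected]
    if len(kept) != collected:
        raise ValueError("not enough samples after burn-in")
    dim = len(kept[0])
    return [sum(s[d] for s in kept) / collected for d in range(dim)]


train_idx, test_idx = split_indices()
toy_mean = posterior_mean([[0.0], [10.0], [2.0], [4.0]], burn_in=2, collected=2)
```

Because no seed or split file is reported, a split produced this way matches the reported sizes but not necessarily the authors' exact partition.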