Deep Generative Models for Relational Data with Side Information
Authors: Changwei Hu, Piyush Rai, Lawrence Carin
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare with various state-of-the-art methods and report results, both quantitative and qualitative, on several benchmark data sets. |
| Researcher Affiliation | Collaboration | Yahoo! Research, New York, NY, USA; CSE Department, IIT Kanpur, Kanpur, UP, India; Duke University, Durham, NC, USA. |
| Pseudocode | No | The paper describes the steps of the Gibbs sampler in prose but does not provide structured pseudocode or an algorithm block. |
| Open Source Code | No | The paper does not include an unambiguous statement about releasing code for the described work, nor does it provide a direct link to a source-code repository. |
| Open Datasets | Yes | We consider seven real-world data sets... Protein230... NIPS234... Conflicts... Facebook... Metabolic... NIPS 1-17... Cite Seer... (Ghosn et al., 2004) |
| Dataset Splits | No | For the two data sets without side information (Protein230 and NIPS234), we hold out 20% data as our test data. For the remaining five data sets, we hold out 80% data as our test data as we were interested in highly missing data regimes to investigate how much the side information is benefitting in such difficult cases. (No explicit mention of a validation set or split for model tuning; a hedged sketch of this hold-out protocol follows the table.) |
| Hardware Specification | Yes | All the models are implemented in MATLAB and were run on a standard machine with 2.40GHz processor and 16GB RAM. |
| Software Dependencies | No | The paper mentions that models are 'implemented in MATLAB' but does not provide specific version numbers for MATLAB or any other ancillary software components. |
| Experiment Setup | Yes | We set K to a large enough number (K = 100) so that all models are evaluated with sufficient number of latent features. Our models and the other baselines (except HGP-EPM) are run with 1000 burn-in iterations, and another 1000 iterations for sample collection. For the HGP-EPM baseline, we use the default setting from (Zhou, 2015) and run their model for 3000 burn-in and 1000 collection iterations. The samplers are initialized randomly. (A hedged sketch of this burn-in/collection protocol also follows the table.) |
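
The hold-out protocol quoted in the Dataset Splits row (20% of entries for Protein230 and NIPS234, 80% for the five data sets with side information) can be mirrored with a short sketch. This is a minimal illustration under assumptions, not the authors' code: it assumes a symmetric binary adjacency matrix `A`, and the function name `make_holdout_mask` and parameter `holdout_frac` are hypothetical.

```python
import numpy as np

def make_holdout_mask(A, holdout_frac, seed=0):
    """Randomly mark a fraction of the upper-triangular (i, j), i < j,
    entries of a symmetric adjacency matrix A as held-out test entries."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    iu, ju = np.triu_indices(n, k=1)            # all node pairs i < j
    test = rng.random(iu.size) < holdout_frac   # Bernoulli(holdout_frac) selection
    test_mask = np.zeros(A.shape, dtype=bool)
    test_mask[iu[test], ju[test]] = True
    test_mask = test_mask | test_mask.T         # keep the mask symmetric
    return test_mask

# Example: 20% held out for Protein230/NIPS234, 80% for the side-information sets
rng = np.random.default_rng(1)
A = (rng.random((230, 230)) < 0.05).astype(int)
A = np.triu(A, k=1); A = A + A.T                # symmetric toy adjacency matrix
test_mask = make_holdout_mask(A, holdout_frac=0.20)
# training uses only entries where test_mask is False; the held-out entries
# are predicted afterwards for link-prediction evaluation
```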
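
The run protocol quoted in the Experiment Setup row (K = 100 latent features, random initialization, 1000 burn-in iterations, 1000 collection iterations; 3000 burn-in for the HGP-EPM baseline) follows the standard burn-in/collection pattern for MCMC. The sketch below illustrates that generic pattern only; `init_random_state`, `gibbs_step`, and `collect` are hypothetical placeholders, not the paper's sampler.

```python
K = 100            # number of latent features, as set in the paper
N_BURNIN = 1000    # burn-in iterations (3000 for the HGP-EPM baseline)
N_COLLECT = 1000   # sample-collection iterations

def run_sampler(init_random_state, gibbs_step, collect):
    """Generic burn-in / collection loop for a Gibbs sampler.

    init_random_state() -> state   random initialization of latent variables
    gibbs_step(state)   -> state   one full sweep of conditional updates
    collect(state)      -> sample  statistic saved after burn-in
    """
    state = init_random_state()
    for _ in range(N_BURNIN):          # discarded: chain is still mixing
        state = gibbs_step(state)
    samples = []
    for _ in range(N_COLLECT):         # kept: post-burn-in samples
        state = gibbs_step(state)
        samples.append(collect(state))
    return samples                     # e.g. averaged to estimate link probabilities
```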