Small-Variance Asymptotics for Nonparametric Bayesian Overlapping Stochastic Blockmodels
Authors: Gundeep Arora, Anupreet Porwal, Kanupriya Agarwal, Avani Samdariya, Piyush Rai
IJCAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our results on several benchmark datasets demonstrate that our algorithm is competitive to methods such as MCMC, while being much faster. |
| Researcher Affiliation | Academia | Gundeep Arora, Anupreet Porwal, Kanupriya Agarwal, Avani Samdariya, Piyush Rai Indian Institute of Technology, Kanpur {gundeep, anupreet, kagarwal, savani, rpiyush}@iitk.ac.in |
| Pseudocode | Yes | Algorithm 1 K-LAFTER |
| Open Source Code | No | The paper does not provide a link to open-source code or explicitly state that the code for the described methodology is being released. |
| Open Datasets | Yes | We report experimental results on the following benchmark datasets, also used in other prior work on LFRM [Miller et al., 2009] and other stochastic blockmodels [Zhou, 2015]. Lazega-Lawyers [Lazega, 2001]: This dataset consists of three small-scale networks and is based on a corporate law partnership. Protein230 Network [Butland et al., 2005]: This dataset consists of the interactions between 230 different proteins, given in the form of an adjacency matrix. NIPS234 Coauthor Network [Miller et al., 2009]: The NIPS234 network consists of 234 nodes, with the relation describing coauthorship among the top 234 authors, by number of publications, in NIPS 1-17. |
| Dataset Splits | Yes | For our link-prediction experiments, we train all the models using 80% of randomly chosen entries of the data matrix Y, and the remaining 20% of the data is used to test the trained model. We consider five random training-testing partitions for all datasets and report the average Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC). Our model has only one free hyperparameter λ, which we tune using k-fold cross-validation on the training data. (A sketch of this evaluation protocol follows the table.) |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | Yes | Our model has only one free hyperparameter λ, which we tune using k-fold cross-validation on the training data. We would like to note that the performance of our algorithm is fairly insensitive to the exact choice of λ; in most cases, λ = 0.5 worked well. We initialize our K-LAFTER algorithm (which we will refer to as LFRM-SVA in the rest of this section) with K = 1. Initializing with a larger K leads to slightly faster convergence. On all the datasets, our SVA-based algorithm converged within 100 iterations if initialized with K = 1, and in as few as 10 iterations if initialized with a larger K (e.g., K = 10). The MCMC sampling-based LFRM (referred to as LFRM-MCMC) was run for 1000 iterations, with 500 burn-in and 500 collection iterations. (A sketch of the λ tuning setup follows the table.) |
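The "Dataset Splits" row describes the link-prediction protocol: 80% of the entries of the adjacency matrix Y are observed during training, the remaining 20% are held out, and the average AUC of the ROC is reported over five random partitions. Below is a minimal sketch of that protocol under stated assumptions; `fit_model` and `predict_probs` are hypothetical stand-ins for any blockmodel implementation (the paper's K-LAFTER/LFRM-SVA algorithm itself is not reproduced here).

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def evaluate_link_prediction(Y, fit_model, predict_probs, n_splits=5,
                             train_frac=0.8, seed=0):
    """Average test AUC over random 80/20 splits of the entries of Y.

    `fit_model(Y, mask)` and `predict_probs(model, pairs)` are hypothetical
    callables standing in for a concrete blockmodel implementation.
    """
    rng = np.random.default_rng(seed)
    n = Y.shape[0]
    # All (i, j) index pairs of the adjacency matrix.
    pairs = np.array([(i, j) for i in range(n) for j in range(n)])
    aucs = []
    for _ in range(n_splits):
        perm = rng.permutation(len(pairs))
        n_train = int(train_frac * len(pairs))
        train_idx, test_idx = pairs[perm[:n_train]], pairs[perm[n_train:]]
        # Mask marking which entries of Y are observed during training.
        mask = np.zeros_like(Y, dtype=bool)
        mask[train_idx[:, 0], train_idx[:, 1]] = True
        model = fit_model(Y, mask)              # train on the observed 80%
        probs = predict_probs(model, test_idx)  # P(link) for held-out pairs
        truth = Y[test_idx[:, 0], test_idx[:, 1]]
        aucs.append(roc_auc_score(truth, probs))
    return float(np.mean(aucs))
```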
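The "Experiment Setup" row states that the single free hyperparameter λ is tuned by k-fold cross-validation on the training entries, with λ = 0.5 typically working well and K initialized to 1. A minimal sketch of such a tuning loop, under the same hypothetical `fit_model`/`predict_probs` interface as above (the `lam` and `K_init` arguments and the candidate grid are illustrative, not taken from the paper), is:

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def tune_lambda(Y, train_pairs, fit_model, predict_probs,
                candidates=(0.1, 0.25, 0.5, 0.75, 1.0), k=5, seed=0):
    """Pick the lambda maximizing mean validation AUC over k folds of the training pairs."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(train_pairs)), k)
    best_lam, best_auc = None, -np.inf
    for lam in candidates:
        fold_aucs = []
        for f in range(k):
            val_idx = train_pairs[folds[f]]
            fit_idx = train_pairs[np.concatenate(
                [folds[g] for g in range(k) if g != f])]
            # Observe only the fit-fold entries of Y for this candidate lambda.
            mask = np.zeros_like(Y, dtype=bool)
            mask[fit_idx[:, 0], fit_idx[:, 1]] = True
            # K_init=1 mirrors the paper's initialization of K = 1.
            model = fit_model(Y, mask, lam=lam, K_init=1)
            probs = predict_probs(model, val_idx)
            truth = Y[val_idx[:, 0], val_idx[:, 1]]
            fold_aucs.append(roc_auc_score(truth, probs))
        mean_auc = float(np.mean(fold_aucs))
        if mean_auc > best_auc:
            best_lam, best_auc = lam, mean_auc
    return best_lam
```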