Stochastic Inference for Scalable Probabilistic Modeling of Binary Matrices
Authors: José Miguel Hernández-Lobato, Neil Houlsby, Zoubin Ghahramani
ICML 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our method experimentally, demonstrating faster convergence than batch alternatives (Raiko et al., 2007) and yielding more accurate solutions than existing scalable variational methods (Nakajima et al., 2010; Seeger & Bouchard, 2012; Paquet & Koenigstein, 2013). SIBM is evaluated in experiments with synthetic and real-world binary matrices. |
| Researcher Affiliation | Academia | José Miguel Hernández-Lobato (JMH233@CAM.AC.UK), Neil Houlsby (NMTH2@CAM.AC.UK), Zoubin Ghahramani (ZOUBIN@ENG.CAM.AC.UK), University of Cambridge, Department of Engineering, Cambridge CB2 1PZ, UK |
| Pseudocode | Yes | Algorithm 1 Stochastic Inference for Binary Matrices |
| Open Source Code | Yes | All the code and data used is publicly available (http://jmhl.org). |
| Open Datasets | Yes | We consider two real-world datasets from the FIMI repository: purchase data from a retail store (retail) (Brijs et al., 1999) and click data from an online news portal (Kosarak). We include two datasets from the 2000 KDD Cup (Kohavi et al., 2000; Zheng et al., 2001), point of sale data from a retailer (POS, originally BMS-POS) and click data from an e-commerce website (Web View, originally BMS-Web View-2). Finally, we include the Netflix data, treating 4-5 star ratings as ones. All the code and data used is publicly available (http://jmhl.org). |
| Dataset Splits | Yes | Each matrix is randomly split into a training matrix and a set of test entries with value one. The training matrix is generated by randomly removing an entry with value one from each row in the original matrix and adding it to the test set. Predictive performance is evaluated using recall at N, a popular metric for recommendation tasks (Gunawardana & Shani, 2009) (equivalent to precision when a single one is held out from each row). For this, we iterate over the rows, using (3) to compute the probability of each zero entry actually taking value one. We select the top N zero entries with highest probability in that row. Recall is computed as the average number of times that the test entry appears in this list. We use N = 10 and repeat the experiment 25 times on each small dataset, and 10 times on each large one. A sketch of this split and the recall@10 computation is given after the table. |
| Hardware Specification | No | The paper states that "all methods are coded in C" but does not specify any hardware details such as GPU models, CPU types, or memory specifications used for the experiments. |
| Software Dependencies | No | The paper states that "all methods are coded in C" but does not provide specific software dependencies (e.g., library names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | We pre-process the original datasets so that we can compare to the computationally expensive batch approach. We keep the 1000 columns with the highest number of ones and discard rows with fewer than 10 ones. We consider small and large versions of each dataset. We subsample 2000 rows for the small and 40,000 rows for the large datasets, except in retail and Web View, where we use approximately the maximum number of rows for the large datasets, 10,000 and 5000, respectively. We fix D = 10 and generate U and V by sampling all the ui,d and vj,d independently from N(0, 1). The global bias is fixed to z = 7.5. We use N = 10 and repeat the experiment 25 times on each small dataset, and 10 times on each large one. A sketch of this preprocessing and synthetic-data generation is also given after the table. |
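
The Dataset Splits row describes a leave-one-out protocol evaluated with recall at N = 10. The sketch below is an illustrative Python reconstruction of that protocol, not the authors' C implementation; `predict_proba` is a hypothetical placeholder for the paper's predictive probability (equation (3)).

```python
import numpy as np

rng = np.random.default_rng(0)

def leave_one_out_split(X):
    """For each row, move one randomly chosen entry with value one to the test set.

    Assumes every row has at least one one (the preprocessing below discards
    rows with fewer than 10 ones).
    """
    X_train = X.copy()
    test = []  # (row, column) pairs of held-out ones
    for i in range(X.shape[0]):
        ones = np.flatnonzero(X[i])
        j = rng.choice(ones)
        X_train[i, j] = 0
        test.append((i, j))
    return X_train, test

def recall_at_n(X_train, test, predict_proba, n=10):
    """Fraction of rows whose held-out one appears among the top-n scored zero entries."""
    hits = 0
    for i, j_true in test:
        zeros = np.flatnonzero(X_train[i] == 0)
        scores = predict_proba(i, zeros)            # P(x_ij = 1) for each zero entry
        top = zeros[np.argsort(scores)[::-1][:n]]   # column indices of the top-n zeros
        hits += int(j_true in top)
    return hits / len(test)
```

The review notes that each experiment is repeated 25 times (small datasets) or 10 times (large datasets); that outer loop would simply re-run the split and evaluation with different random seeds.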
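
The Experiment Setup row describes column and row filtering, row subsampling, and a synthetic generator with D = 10 latent dimensions and a global bias of 7.5. The following sketch reproduces those steps under assumptions: function names are illustrative, and a logistic link is assumed for the synthetic matrix, which may not match the paper's exact likelihood.

```python
import numpy as np

rng = np.random.default_rng(0)

def preprocess(X, n_cols=1000, min_ones=10, n_rows=2000):
    """Keep the densest columns, drop sparse rows, then subsample rows."""
    X = np.asarray(X)
    # Keep the n_cols columns with the highest number of ones.
    top_cols = np.argsort(X.sum(axis=0))[::-1][:n_cols]
    X = X[:, top_cols]
    # Discard rows with fewer than min_ones ones.
    X = X[X.sum(axis=1) >= min_ones]
    # Subsample rows (2000 for the small versions, 40,000 for the large ones).
    if X.shape[0] > n_rows:
        X = X[rng.choice(X.shape[0], size=n_rows, replace=False)]
    return X

def synthetic_matrix(n=2000, m=1000, D=10, bias=7.5):
    """Sample U, V with i.i.d. N(0, 1) entries plus a global bias, then draw a binary matrix.

    The bias value and sign follow the setup quoted above; the logistic link is an assumption.
    """
    U = rng.standard_normal((n, D))
    V = rng.standard_normal((m, D))
    p = 1.0 / (1.0 + np.exp(-(U @ V.T + bias)))
    return (rng.random((n, m)) < p).astype(np.int8)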