Sparse Boltzmann Machines with Structure Learning as Applied to Text Analysis
Authors: Zhourong Chen, Nevin Zhang, Dit-Yan Yeung, Peixian Chen
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results show that the method yields models with significantly improved model fit and interpretability as compared with RBMs where each hidden unit is connected to all visible units. In this section we test the performance of SBMs on three text datasets of different scales: NIPS proceeding papers, CiteULike articles, and New York Times dataset. Experimental results show that SBMs perform consistently well over the three datasets in terms of model generalizability, and SBMs always give much better interpretability. |
| Researcher Affiliation | Academia | Zhourong Chen, Nevin L. Zhang, Dit-Yan Yeung, Peixian Chen; Hong Kong University of Science and Technology; {zchenbb,lzhang,dyyeung,pchenac}@cse.ust.hk |
| Pseudocode | Yes | Algorithm 1 SBM-SFC(T) |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that the source code for the described methodology is publicly available. Footnotes provide links to datasets and a third-party word2vec model, but not the authors' implementation. |
| Open Datasets | Yes | NIPS proceeding papers consist of 1,740 NIPS papers published from 1987 to 1999. We randomly sample 1,640 papers as training data, 50 as validation data and the remaining 50 as test data. (http://www.cs.nyu.edu/~roweis/data.html) CiteULike article collection contains 16,980 articles. Similarly, we randomly divide it into training data with 12,000 articles, validation data with 1,000 articles and test data with 3,980 articles. (http://www.wanghao.in/data/ctrsr_datasets.rar) The New York Times dataset includes 300,000 documents, among which we randomly pick 290,000 documents for training, 1,000 for validation and 9,000 for testing. (http://archive.ics.uci.edu/ml/datasets/Bag+of+Words) |
| Dataset Splits | Yes | We randomly sample 1,640 papers as training data, 50 as validation data and the remaining 50 as test data. (for NIPS); we randomly divide it into training data with 12,000 articles, validation data with 1,000 articles and test data with 3,980 articles. (for CiteULike); randomly pick 290,000 documents for training, 1,000 for validation and 9,000 for testing. (for New York Times). A minimal split sketch appears after the table. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. It discusses datasets, training parameters, and evaluation metrics, but omits hardware information. |
| Software Dependencies | No | The paper mentions using a 'word2vec model' and the 'CD algorithm' but does not specify any software dependencies with version numbers (e.g., library names like PyTorch, TensorFlow, or specific versions of Python). |
| Experiment Setup | Yes | The batch sizes of dataset NIPS, CiteULike and New York Times are 10, 100 and 1,000 respectively. Model parameters are updated after each mini-batch. We set the maximum number of training epochs to 50. And we train all the models using the CD algorithm with T = 10 full Gibbs steps. As for RBM-based Replicated Softmax, we determine the optimal number of hidden units over the validation data with 10 units as the step size. While for SBMs, we first train a two-layer HLTM and then increase the number of connections such that every hidden unit is connected to 20% of the visible units that are most correlated. A mask matrix is applied to the connection matrix after each parameter update so as to force the sparse connectivity (see the masked CD sketch after the table). |
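For concreteness, here is a minimal sketch of the random document splits quoted in the Dataset Splits row. The split sizes come from the paper; the function name `random_split`, the fixed seed, and the list-of-documents representation are illustrative assumptions, not the authors' code.

```python
# Hypothetical helper reproducing the quoted random splits; not the authors' code.
import numpy as np

def random_split(docs, n_train, n_val, seed=0):
    """Shuffle documents and split into train/validation/test partitions."""
    idx = np.random.default_rng(seed).permutation(len(docs))
    train = [docs[i] for i in idx[:n_train]]
    val = [docs[i] for i in idx[n_train:n_train + n_val]]
    test = [docs[i] for i in idx[n_train + n_val:]]  # remainder is the test set
    return train, val, test

# Quoted sizes -- NIPS: 1,640/50/50; CiteULike: 12,000/1,000/3,980; NYT: 290,000/1,000/9,000
# train, val, test = random_split(nips_docs, n_train=1640, n_val=50)
```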
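The Experiment Setup row describes two concrete mechanics: CD training with T = 10 full Gibbs steps, and a mask matrix re-applied to the connection matrix after every parameter update to force sparse connectivity. Below is a minimal sketch of that masked CD-T update for a plain binary RBM (the paper actually uses a Replicated Softmax variant); the toy sizes, the random choice of connected units (the paper connects each hidden unit to the 20% most correlated visible units), and all names are assumptions for illustration.

```python
# Minimal sketch, not the authors' implementation: CD-T training of a binary RBM
# with a fixed binary mask enforcing sparse visible-hidden connectivity.
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden = 100, 20          # toy sizes, assumed for illustration

# Fixed mask: each hidden unit keeps 20% of the visible units. The paper picks
# the most correlated units; here we pick them at random as a stand-in.
mask = np.zeros((n_visible, n_hidden))
k = int(0.2 * n_visible)
for j in range(n_hidden):
    mask[rng.choice(n_visible, size=k, replace=False), j] = 1.0

W = 0.01 * rng.standard_normal((n_visible, n_hidden)) * mask
b_v, b_h = np.zeros(n_visible), np.zeros(n_hidden)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd_update(v0, lr=0.01, T=10):
    """One CD-T update on a mini-batch v0 of shape (batch, n_visible)."""
    global W, b_v, b_h
    ph0 = sigmoid(v0 @ W + b_h)                        # positive phase
    h = (rng.random(ph0.shape) < ph0).astype(float)
    for _ in range(T):                                 # T full Gibbs steps
        pv = sigmoid(h @ W.T + b_v)
        v = (rng.random(pv.shape) < pv).astype(float)
        ph = sigmoid(v @ W + b_h)
        h = (rng.random(ph.shape) < ph).astype(float)
    W += lr * (v0.T @ ph0 - v.T @ ph) / len(v0)        # CD gradient step
    W *= mask                                          # re-apply mask after the update
    b_v += lr * (v0 - v).mean(axis=0)
    b_h += lr * (ph0 - ph).mean(axis=0)
```

The key line is `W *= mask`: the gradient step can produce nonzero weights outside the allowed pattern, and multiplying by the fixed binary mask after each update zeroes them out, matching the quoted statement that a mask matrix is applied to the connection matrix after each parameter update.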