Learning Restricted Boltzmann Machines with Sparse Latent Variables

Authors: Guy Bresler, Rares-Darius Buhai

NeurIPS 2020

Reproducibility variables, results, and LLM responses:
Research Type: Theoretical
LLM Response: We give an algorithm for learning general RBMs with time complexity O(n^(2^s + 1)), where s is the maximum number of latent variables connected to the MRF neighborhood of an observed variable. This is an improvement when s < log₂(d − 1), which corresponds to RBMs with sparse latent variables. Furthermore, we give a version of this learning algorithm that recovers a model with small prediction error and whose sample complexity is independent of the minimum potential in the Markov Random Field of the observed variables. ... Our algorithm is an extension of the algorithm of [10] for learning MRFs. ... Theorem 6. Fix ω > 0. Suppose we are given M samples from an RBM, where M = O( (1 / (τ² (e^(−2γ))^(2L))) · (log(1/ω) + log(L + 2^s + 1) + (L + 2^s + 1) log(2n) + log 2) ). Then with probability at least 1 − ω, our algorithm, when run from each observed variable u, recovers the correct neighborhood of u. Each run of the algorithm takes O(M L n^(2^s + 1)) time.
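For context on the threshold quoted above: the condition s < log₂(d − 1) follows from comparing exponents, assuming (as in the paper's setup, though not stated in this excerpt) that the baseline runtime for general RBMs is n^d, with d the maximum degree of a latent variable. A short derivation:

```latex
% Compare the new exponent 2^s + 1 with the assumed baseline exponent d
% (d = maximum degree of a latent variable; an assumption of this sketch).
\[
  n^{2^s + 1} < n^{d}
  \iff 2^s + 1 < d
  \iff 2^s < d - 1
  \iff s < \log_2(d - 1).
\]
```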
Researcher Affiliation: Academia
LLM Response: Guy Bresler (MIT, guy@mit.edu) and Rares-Darius Buhai (MIT, rbuhai@mit.edu). Current affiliation of Rares-Darius Buhai: ETH Zurich, rares.buhai@inf.ethz.ch.
Pseudocode: Yes
LLM Response: To find the MRF neighborhood of an observed variable u (i.e., observed variable X_u; we use the index and the variable interchangeably when no confusion is possible), our algorithm takes the following steps, similar to those of the algorithm of [10]:
1. Fix parameters s, τ, L. Fix observed variable u. Set S := ∅.
2. While |S| ≤ L and there exists a set of observed variables I ⊆ [n] \ {u} \ S of size at most 2^s such that ν̂_{u,I|S} > τ, set S := S ∪ I.
3. For each i ∈ S, if ν̂_{u,i|S\{i}} < τ, remove i from S.
4. Return the set S as an estimate of the neighborhood of u.
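A minimal Python sketch of steps 1-4, under stated assumptions: samples is an (M, n) NumPy array of ±1 values, and estimated_influence is an illustrative stand-in for the paper's empirical quantity ν̂_{u,I|S}, which this excerpt does not define (here it is an average conditional covariance between X_u and the parity of X_I, given configurations of X_S). The enumeration over candidate sets of size up to 2^s is what drives the O(M L n^(2^s + 1)) per-run cost quoted in Theorem 6.

```python
from itertools import combinations

import numpy as np


def estimated_influence(samples, u, I, S):
    """Illustrative stand-in for the empirical influence nu_hat_{u,I|S}.

    Averages, over empirical configurations of X_S, the absolute
    conditional covariance between X_u and the parity of X_I.
    (Assumption: the paper's exact estimator is not reproduced here.)
    """
    S = list(S)
    keys = [tuple(row[S]) for row in samples]
    total, weight = 0.0, 0
    for key in set(keys):
        mask = np.array([k == key for k in keys])
        block = samples[mask]
        if len(block) < 2:
            continue  # too few samples to estimate a covariance
        parity = np.prod(block[:, list(I)], axis=1)
        cov = np.mean(block[:, u] * parity) - np.mean(block[:, u]) * np.mean(parity)
        total += len(block) * abs(cov)
        weight += len(block)
    return total / max(weight, 1)


def recover_neighborhood(samples, u, s, tau, L):
    """Greedy MRF-neighborhood recovery for observed variable u (steps 1-4)."""
    n = samples.shape[1]
    S = set()  # step 1
    # Step 2: while the budget allows, add any set I of size <= 2**s whose
    # estimated influence on u given S exceeds the threshold tau.
    while len(S) <= L:
        rest = [v for v in range(n) if v != u and v not in S]
        found = None
        for size in range(1, 2 ** s + 1):
            for I in combinations(rest, size):
                if estimated_influence(samples, u, I, S) > tau:
                    found = I
                    break
            if found is not None:
                break
        if found is None:
            break
        S |= set(found)
    # Step 3: prune variables whose influence given the rest of S is small.
    for i in sorted(S):
        if estimated_influence(samples, u, (i,), S - {i}) < tau:
            S.discard(i)
    return S  # step 4: estimate of the MRF neighborhood of u
```

As a hypothetical usage example, recover_neighborhood(samples, u=0, s=1, tau=0.05, L=10) estimates the neighborhood of the first observed variable; the values of τ and L here are placeholders, since the paper ties them to the RBM's potential bounds.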
Open Source Code: No
LLM Response: The paper does not include any statements about providing open-source code for the methodology, nor does it link to any code repositories.

Open Datasets: No
LLM Response: The paper is theoretical and focuses on algorithmic and sample complexity bounds. It does not describe experiments on specific datasets, public or otherwise, for training or evaluation. It refers to "M samples from an RBM" only in a theoretical context, to define sample complexity, not as an actual dataset.

Dataset Splits: No
LLM Response: The paper is theoretical and does not describe empirical experiments involving datasets, so no training, validation, or test splits are reported.

Hardware Specification: No
LLM Response: The paper focuses on algorithm design and complexity analysis. It does not describe empirical experiments or mention specific hardware, such as GPU/CPU models or cloud resources.

Software Dependencies: No
LLM Response: The paper does not mention any software dependencies or version numbers required to implement or run the described algorithms.

Experiment Setup: No
LLM Response: The paper describes an algorithm along with its theoretical guarantees and complexities. While it mentions parameters such as s, τ, and L, these belong to the theoretical model, not to an experimental setup with hyperparameters, training configurations, or system-level settings.