Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Monotone deep Boltzmann machines

Authors: Zhili Feng, Ezra Winston, J. Zico Kolter

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | As a proof of concept, we evaluate our proposed mDBM on the MNIST and CIFAR-10 datasets. We demonstrate how to jointly model missing pixels and class labels conditioned on only a subset of observed pixels. On MNIST, we compare mDBM to mean-field inference in a traditional deep RBM. Despite being small-scale tasks, the goal here is to demonstrate joint inference and learning over what is still a reasonably-sized joint model, considering the number of hidden units. Nonetheless, the current experiments are admittedly largely a demonstration of the proposed method rather than a full accounting of its performance. We also show how our mean-field inference method compares to those proposed in prior works. On the joint imputation and classification task, we train models using our updates and the updates proposed by Krähenbühl & Koltun (2013) and Baqué et al. (2016), and perform mean-field inference in each model using all three update methods, with and without the monotonicity constraint. Pixel imputation is shown in Figure 3. We report the image imputation ℓ2 loss on MNIST in Table 1. We additionally evaluate mDBM on a task in which random 14×14 patches are masked. We evaluate mDBM on an analogous task of image pixel imputation and label prediction on CIFAR-10. The imputation error is reported in Table 2.
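The masked-patch imputation task quoted above can be illustrated with a small sketch. All helper names here are assumptions (the paper's actual implementation is in PyTorch); this stdlib-only version just shows masking a random 14×14 patch and measuring the ℓ2 error over the hidden pixels:

```python
import random

def mask_patch(image, patch=14, seed=0):
    """Mask a random patch x patch square of a 2-D image (list of rows).

    Returns (masked_image, mask), where mask[r][c] is True for observed
    pixels and False for hidden ones. Illustrative helper only; the paper
    does not give this code.
    """
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    top = rng.randrange(h - patch + 1)
    left = rng.randrange(w - patch + 1)
    mask = [[not (top <= r < top + patch and left <= c < left + patch)
             for c in range(w)] for r in range(h)]
    masked = [[image[r][c] if mask[r][c] else 0.0 for c in range(w)]
              for r in range(h)]
    return masked, mask

def imputation_l2(pred, target, mask):
    """l2 imputation error measured over the hidden (masked-out) pixels only."""
    sq = sum((pred[r][c] - target[r][c]) ** 2
             for r in range(len(mask)) for c in range(len(mask[0]))
             if not mask[r][c])
    return sq ** 0.5
```

A 28×28 MNIST-sized image masked this way hides exactly 196 pixels, and a perfect imputation scores an ℓ2 error of zero.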
Researcher Affiliation | Collaboration | Zhili Feng (EMAIL), Machine Learning Department, Carnegie Mellon University; Ezra Winston (EMAIL), Machine Learning Department, Carnegie Mellon University; J. Zico Kolter (EMAIL), Computer Science Department, Carnegie Mellon University, and Bosch Center for AI
Pseudocode | Yes | Algorithm 1 (Forward Iteration), Algorithm 2 (Backward Iteration), Algorithm 3 (Training)
Open Source Code | Yes | We describe the details in the appendix and include an efficient PyTorch function implementation in the supplementary material.
Open Datasets | Yes | As a proof of concept, we evaluate our proposed mDBM on the MNIST and CIFAR-10 datasets.
Dataset Splits | Yes | For the joint imputation and classification task, we randomly mask each pixel independently with probability 60%, such that in expectation only 40% of the pixels are observed. We randomly mask a fraction p ∈ {0.2, 0.4, 0.6, 0.8} of the inputs. For each p, the experiments are conducted 5 times, where each run independently chooses the random mask. For CIFAR-10, we train for 100 epochs using standard data augmentation; during the first 10 epochs, the weight on the reconstruction loss is ramped up from 0.0 to 0.5 and the weight on the classification loss is ramped down from 1.0 to 0.5; also during the first 20 epochs, the percentage of observed pixels is ramped down from 100% to 50%.
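The CIFAR-10 ramp schedules described above can be sketched as a simple linear ramp. The helper name and schedule shapes are assumptions; the paper does not give code for this:

```python
def linear_ramp(epoch, start, end, ramp_epochs):
    """Linearly ramp a value from `start` to `end` over the first
    `ramp_epochs` epochs, then hold at `end`. Illustrative helper only."""
    if epoch >= ramp_epochs:
        return end
    return start + (end - start) * epoch / ramp_epochs

# Schedules as described for CIFAR-10 over 100 epochs (assumed shapes):
recon_w = [linear_ramp(e, 0.0, 0.5, 10) for e in range(100)]  # reconstruction-loss weight
cls_w   = [linear_ramp(e, 1.0, 0.5, 10) for e in range(100)]  # classification-loss weight
obs_pct = [linear_ramp(e, 1.0, 0.5, 20) for e in range(100)]  # observed-pixel fraction
```

Under this reading, the reconstruction weight reaches 0.5 at epoch 10, the classification weight drops to 0.5 at the same point, and the observed-pixel fraction reaches 50% at epoch 20 and stays there.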
Hardware Specification | No | The paper states "we derive a highly efficient GPU-based implementation" but does not specify any particular GPU model or other hardware details.
Software Dependencies | No | The paper mentions a "PyTorch function implementation" but does not provide a specific version number for PyTorch or any other software dependencies.
Experiment Setup | Yes | Treating the image reconstruction as a dense classification task, we use cross-entropy loss and class weights (1 − β)/(1 − β^{n_i}) with β = 0.9999 (Cui et al., 2019), where n_i is the number of times pixels with intensity i appear in the hidden pixels. For classification, we use standard cross-entropy loss. To enable joint training, we put equal weight of 0.5 on both task losses and backpropagate through their sum. For both tasks, we put τ_i Φ(q)_i into the cross-entropy loss as logits, as described in Equation (17). To achieve faster damped forward-backward iteration, we implement Anderson acceleration (Walker & Ni, 2011), and stop the fixed-point update as soon as the relative difference between two iterations (that is, ‖q^{t+1} − q^t‖ / ‖q^t‖) is less than 0.01, unless we hit a maximum of 50 allowed iterations. For prox_α f and the damped iteration, we set α = 0.125. We use the Adam optimizer with learning rate 0.001. For MNIST, we train for 40 epochs. For CIFAR-10, we train for 100 epochs using standard data augmentation; during the first 10 epochs, the weight on the reconstruction loss is ramped up from 0.0 to 0.5 and the weight on the classification loss is ramped down from 1.0 to 0.5; also during the first 20 epochs, the percentage of observed pixels is ramped down from 100% to 50%. The deep RBM is trained using the CD-1 algorithm for 100 epochs with a batch size of 128 and a learning rate of 0.01.
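Two pieces of the setup above lend themselves to a short sketch: the class-balanced weights (1 − β)/(1 − β^{n_i}) from Cui et al. (2019), and the relative-difference stopping rule for the damped fixed-point iteration. Function names are assumptions, and Anderson acceleration is omitted for brevity:

```python
import math

def class_balanced_weights(counts, beta=0.9999):
    """Class-balanced weights (1 - beta) / (1 - beta**n_i), per Cui et al.
    (2019); counts[i] is how often intensity class i appears among the
    hidden pixels. Illustrative helper, not the paper's code."""
    return [(1.0 - beta) / (1.0 - beta ** n) for n in counts]

def fixed_point_iterate(f, q0, tol=0.01, max_iter=50):
    """Run fixed-point updates q <- f(q), stopping once the relative change
    ||q_{t+1} - q_t|| / ||q_t|| falls below tol, or after max_iter updates.
    Sketch only: the paper additionally uses damping and Anderson
    acceleration (Walker & Ni, 2011)."""
    q = list(q0)
    for _ in range(max_iter):
        q_next = f(q)
        num = math.sqrt(sum((a - b) ** 2 for a, b in zip(q_next, q)))
        den = math.sqrt(sum(b ** 2 for b in q)) or 1.0
        q = q_next
        if num / den < tol:
            break
    return q
```

Note that a rare class (small n_i) receives a weight near 1, while a very frequent class has its weight shrunk toward 1 − β, which is the intended rebalancing effect.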