Efficient Score Matching with Deep Equilibrium Layers
Authors: Yuhao Huang, Qingsong Wang, Akwum Onwunta, Bao Wang
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate the performance of the proposed DEQ-assisted score matching models, including score matching variational autoencoder (SMVAE) [40] in Section 4.1 and noise conditional score network (NCSN) [40] in Section 4.4 for generative modeling. We also consider deep kernel exponential families (DKEF) [44] in Section 4.2 and nonlinear independent components estimation (NICE) [10] in Section 4.3 for density estimation. |
| Researcher Affiliation | Academia | Yuhao Huang¹, Qingsong Wang¹, Akwum Onwunta², & Bao Wang¹. ¹Department of Mathematics and Scientific Computing and Imaging (SCI) Institute, University of Utah, Salt Lake City, UT 84102, USA; ²Department of Industrial and Systems Engineering, Lehigh University, Bethlehem, PA 18015, USA |
| Pseudocode | Yes | Appendix B (IMPLEMENTATION OF DEQ), Algorithm 1: Implementation of the fixed-point iteration in PyTorch-style pseudocode. (A hedged sketch of such an iteration follows the table.) |
| Open Source Code | Yes | Second, we submitted the code in the supplementary materials to ensure the experimental results can be easily reproduced. |
| Open Datasets | Yes | We consider two image generation tasks: CelebA [24] and Cifar10 [20]. CelebA is a dataset that contains 64×64×3 color images that identify celebrity face attributes, and Cifar10 contains 10 classes of 32×32×3 color images. We consider the density estimation on two datasets: UCI (Parkinson/Redwine/Whitewine) [11] and high-dimensional Gaussian. In this experiment, we consider the MNIST dataset [9], which contains 60K 28×28 grayscale images. |
| Dataset Splits | Yes | Each dataset is split into train, validation, and test sets with 70%, 20%, and 10%, respectively. (A split sketch follows the table.) |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments were provided. |
| Software Dependencies | No | The paper mentions PyTorch [30] and Adam [17] but does not specify their version numbers or other software dependencies with version information. |
| Experiment Setup | Yes | We train the models using Adam [17], for 10^5 iterations, with learning rate 1e-4, weight decay 1e-12, and batch size 128. (A training-loop sketch using these hyperparameters follows the table.) |
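The Pseudocode row refers to Algorithm 1, a fixed-point iteration for the DEQ layer's forward pass. Below is a minimal sketch of what such an iteration typically looks like; the layer function `f`, the stopping tolerance `tol`, and the iteration cap `max_iter` are illustrative assumptions, not the authors' exact Algorithm 1.

```python
import torch

def fixed_point_iteration(f, x, z0=None, max_iter=50, tol=1e-4):
    """Iterate z <- f(z, x) until convergence (a DEQ forward-pass sketch).

    `f`, `max_iter`, and `tol` are illustrative; the paper's Algorithm 1
    may use a different solver or stopping rule.
    """
    z = torch.zeros_like(x) if z0 is None else z0
    for _ in range(max_iter):
        z_next = f(z, x)
        # Relative residual as a simple convergence check.
        if torch.norm(z_next - z) / (torch.norm(z) + 1e-8) < tol:
            return z_next
        z = z_next
    return z

# Usage sketch: a simple contractive map as f, so the iteration converges.
f = lambda z, x: torch.tanh(0.5 * z + x)
x = torch.randn(4, 8)
z_star = fixed_point_iteration(f, x)  # approximate equilibrium f(z*, x) = z*
```

A DEQ layer would pair this forward solve with implicit differentiation for the backward pass, but the sketch above only covers the fixed-point iteration that Algorithm 1 describes.

The 70%/20%/10% split reported in the Dataset Splits row maps directly onto PyTorch's `random_split`; the placeholder dataset and seed below are assumptions for illustration.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Placeholder dataset; the paper's datasets (e.g., the UCI tables) would go here.
data = TensorDataset(torch.randn(1000, 16), torch.randn(1000, 1))

n = len(data)
n_train = int(0.7 * n)
n_val = int(0.2 * n)
n_test = n - n_train - n_val  # remainder, roughly 10%

train_set, val_set, test_set = random_split(
    data,
    [n_train, n_val, n_test],
    generator=torch.Generator().manual_seed(0),  # seed is an assumption
)
```

The hyperparameters quoted in the Experiment Setup row (Adam, learning rate 1e-4, weight decay 1e-12, batch size 128, 10^5 iterations) translate into a standard PyTorch training loop. The model and loss below are stand-ins; the paper trains score-matching models with their own objectives.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

# Stand-in model and data; not the paper's score networks or datasets.
model = torch.nn.Linear(16, 1)
data = TensorDataset(torch.randn(1024, 16), torch.randn(1024, 1))
loader = DataLoader(data, batch_size=128, shuffle=True)

# Optimizer settings quoted from the paper's reported setup.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-12)

max_iters = 10**5  # iteration count as stated in the paper
it = 0
while it < max_iters:
    for x, y in loader:
        optimizer.zero_grad()
        loss = F.mse_loss(model(x), y)  # placeholder loss, not score matching
        loss.backward()
        optimizer.step()
        it += 1
        if it >= max_iters:
            break
```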
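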
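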