Learning Distributions Generated by One-Layer ReLU Networks

Authors: Shanshan Wu, Alexandros G. Dimakis, Sujay Sanghavi

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results are provided to support our analysis. We empirically evaluate our algorithm in terms of its dependence on the number of samples, dimension, and condition number (Figure 1).
Researcher Affiliation | Academia | Shanshan Wu, Alexandros G. Dimakis, Sujay Sanghavi, Department of Electrical and Computer Engineering, University of Texas at Austin
Pseudocode | Yes | Algorithm 1: Learning a single-layer ReLU generative model. Algorithm 2: NormBiasEst. Algorithm 3: ProjSGD.
Open Source Code | Yes | Code to reproduce our results can be found at https://github.com/wushanshan/densityEstimation.
Open Datasets | No | The paper mentions generating W and b as random matrices/vectors for experiments ("we generate W as a random orthonormal matrix; we generate b as a random normal vector"), implying synthetic data, but does not refer to a publicly available or open dataset.
Dataset Splits | No | The paper does not explicitly provide details about training, validation, or test dataset splits. Experiments are conducted on generated samples, but no specific partitioning strategy for reproducibility is mentioned.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, processors, or memory used for running the experiments.
Software Dependencies | No | The paper mentions hyper-parameters for the algorithms ("The hyper-parameters are B = 1 (in Algorithm 2), r = 3 and λ = 0.1 (in Algorithm 3)"), but does not list any specific software dependencies with version numbers.
Experiment Setup | Yes | The hyper-parameters are B = 1 (in Algorithm 2), r = 3 and λ = 0.1 (in Algorithm 3). Left: Fix d = 5 and κ = 1. Middle: Fix n = 5 · 10^5 and κ = 1. Right: Fix n = 5 · 10^5 and d = 5. Every point shows the mean and standard deviation across 10 runs. Each run corresponds to a different W and b.
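The Open Datasets and Experiment Setup rows describe a purely synthetic setup: W is drawn as a random orthonormal matrix, b as a random normal vector, and samples come from the one-layer ReLU generative model studied in the paper. Below is a minimal sketch of that data-generation step, assuming the model x = ReLU(Wz + b) with z ~ N(0, I) and κ = 1 (orthonormal W); the function and variable names are illustrative and do not come from the released code at the repository above.

```python
import numpy as np

def sample_one_layer_relu(n, d, seed=0):
    """Draw n synthetic samples from x = ReLU(W z + b), z ~ N(0, I_d).

    W is a random orthonormal matrix (condition number 1) and b a random
    normal vector, mirroring the setup quoted in the table rows above.
    Names and defaults are illustrative, not taken from the paper's code.
    """
    rng = np.random.default_rng(seed)
    # Random orthonormal W via QR decomposition of a Gaussian matrix.
    W, _ = np.linalg.qr(rng.standard_normal((d, d)))
    b = rng.standard_normal(d)          # random normal bias vector
    z = rng.standard_normal((n, d))     # latent Gaussian codes
    x = np.maximum(z @ W.T + b, 0.0)    # one-layer ReLU generative model
    return x, W, b

# Example mirroring the fixed settings quoted in the Experiment Setup row
# (d = 5, κ = 1, n = 5 · 10^5):
x, W, b = sample_one_layer_relu(n=5 * 10**5, d=5)
```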