SGD Learns One-Layer Networks in WGANs

Authors: Qi Lei, Jason Lee, Alex Dimakis, Constantinos Daskalakis

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we provide simple experimental results to validate the performance of stochastic gradient descent ascent and provide experimental support for our theory. In Figure 1 we plot how the relative error in parameter estimation decreases as the sample size increases."
Researcher Affiliation | Academia | 1 University of Texas at Austin, TX; 2 Princeton University, NJ; 3 Massachusetts Institute of Technology, MA.
Pseudocode | Yes | "Algorithm 1: Online stochastic gradient descent ascent on WGAN" (an illustrative sketch of this update loop appears after the table).
Open Source Code | No | The paper does not explicitly state that open-source code for the described methodology is available, nor does it provide a link to a code repository.
Open Datasets | No | The paper uses samples generated from a target distribution (x ~ D = G_{A*}(N(0, I_{k0×k0}))) and synthetic data for its experiments, as described in Section 3 and Algorithm 1 ('n training samples: x_1, x_2, ..., x_n, where each x_i ~ φ(A* z), z ~ N(0, I_{k×k})'). It does not use a named, publicly available dataset with concrete access information such as a URL or citation.
Dataset Splits | No | The paper does not explicitly mention a validation set or specific dataset splits; the experiments train on generated samples and evaluate parameter recovery directly.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory specifications, or cloud instance types) used to run the experiments.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., programming languages, libraries, or frameworks) that would be needed to replicate the experiments.
Experiment Setup | Yes | "We fix the hidden dimension k = 2, and vary the output dimension over {3, 5, 7} and sample complexity over {500, 1000, 2000, 5000, 10000}. To visually demonstrate the learning process, we also include a simple comparison for different φ, i.e. leaky ReLU and tanh activations, when k = 1 and d = 2... Each arrow indicates the progress of 500 iteration steps." (A sketch of this sweep follows the table.)
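
For concreteness, below is a minimal NumPy sketch of the kind of update loop that Algorithm 1 ("online stochastic gradient descent ascent on WGAN") describes, on synthetic data generated exactly as in the Open Datasets row (x = φ(A* z), z ~ N(0, I_{k×k})). Since no code was released, this is an assumption-laden illustration, not the authors' implementation: the quadratic critic f_V(x) = xᵀVx, the step sizes, the batch size, and the small decay on V are all our choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(t, slope=0.1):
    """Leaky-ReLU activation (the paper also tests tanh)."""
    return np.where(t > 0, t, slope * t)

def dphi(t, slope=0.1):
    """Leaky-ReLU derivative, needed for the generator gradient."""
    return np.where(t > 0, 1.0, slope)

def sample_real(A_star, m, rng):
    """m target samples x = phi(A* z), z ~ N(0, I_k) (Section 3 / Algorithm 1)."""
    z = rng.standard_normal((A_star.shape[1], m))
    return phi(A_star @ z)

def sgda(A_star, n_steps=5000, batch=64, eta_g=1e-2, eta_d=1e-2, rng=rng):
    """One ascent step on a quadratic critic f_V(x) = x^T V x, then one
    descent step on the one-layer generator x = phi(A z), repeated online.
    Illustrative only; the paper's critic class and schedule may differ."""
    d, k = A_star.shape
    A = 0.1 * rng.standard_normal((d, k))
    V = np.zeros((d, d))
    for _ in range(n_steps):
        x_real = sample_real(A_star, batch, rng)
        z = rng.standard_normal((k, batch))
        pre = A @ z
        x_fake = phi(pre)
        # Ascent on V: the gradient of E[f_V(x_real)] - E[f_V(x_fake)]
        # w.r.t. V is the gap between empirical second moments.  The tiny
        # decay on V is a hypothetical stabilizer, not from the paper.
        grad_V = (x_real @ x_real.T - x_fake @ x_fake.T) / batch
        V = (1.0 - 1e-3) * V + eta_d * grad_V
        # Descent on A: chain rule through f_V, phi, and z -> A z.  Only
        # the -E[f_V(x_fake)] term of the objective depends on A.
        g = ((V + V.T) @ x_fake) * dphi(pre)
        A += eta_g * (g @ z.T) / batch
    return A

def rel_error(A, A_star):
    """Relative error on second moments: invariant to the sign/permutation
    ambiguity in A.  A crude proxy for the metric in the paper's Figure 1."""
    M, M_star = A @ A.T, A_star @ A_star.T
    return np.linalg.norm(M - M_star) / np.linalg.norm(M_star)
```

The second-moment error proxy sidesteps the ambiguity in recovering A up to column sign and permutation; the exact metric behind the paper's Figure 1 may differ.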
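
Under the same caveats, here is a sketch of the sweep described in the Experiment Setup row (k = 2 fixed, d over {3, 5, 7}, n over {500, 1000, 2000, 5000, 10000}). It reuses the helpers above, but draws minibatches from a fixed pool of n training samples rather than fresh samples; the iteration count is a guess. If SGDA converges, the printed relative error should shrink as n grows, mirroring the trend the paper reports for Figure 1.

```python
# Hypothetical reproduction of the (d, n) sweep: minibatches come from a
# fixed pool of n training samples instead of fresh draws.
def sgda_fixed_pool(x_pool, k, n_steps=5000, batch=64,
                    eta_g=1e-2, eta_d=1e-2, rng=rng):
    d = x_pool.shape[0]
    A = 0.1 * rng.standard_normal((d, k))
    V = np.zeros((d, d))
    for _ in range(n_steps):
        idx = rng.integers(0, x_pool.shape[1], size=batch)
        x_real = x_pool[:, idx]
        z = rng.standard_normal((k, batch))
        pre = A @ z
        x_fake = phi(pre)
        grad_V = (x_real @ x_real.T - x_fake @ x_fake.T) / batch
        V = (1.0 - 1e-3) * V + eta_d * grad_V
        g = ((V + V.T) @ x_fake) * dphi(pre)
        A += eta_g * (g @ z.T) / batch
    return A

k = 2
for d in (3, 5, 7):                            # output dimensions from the table
    A_star = rng.standard_normal((d, k))
    for n in (500, 1000, 2000, 5000, 10000):   # sample sizes from the table
        x_pool = sample_real(A_star, n, rng)
        A_hat = sgda_fixed_pool(x_pool, k)
        print(f"d={d}  n={n:5d}  rel_err={rel_error(A_hat, A_star):.3f}")
```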