SGD Learns One-Layer Networks in WGANs
Authors: Qi Lei, Jason Lee, Alex Dimakis, Constantinos Daskalakis
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we provide simple experimental results to validate the performance of stochastic gradient descent ascent and to provide experimental support for our theory. In Figure 1 we plot how the relative parameter-estimation error decreases as the sample complexity increases. |
| Researcher Affiliation | Academia | University of Texas at Austin, TX; Princeton University, NJ; Massachusetts Institute of Technology, MA. |
| Pseudocode | Yes | Algorithm 1 Online stochastic gradient descent ascent on WGAN (a hedged sketch of such a loop appears after the table). |
| Open Source Code | No | The paper does not explicitly state that open-source code for the described methodology is available, nor does it provide a link to a code repository. |
| Open Datasets | No | The paper uses samples generated from a 'target distribution' (x ~ D = G_{A*}(N(0, I_{k_0×k_0}))) and synthetic data for its experiments, as described in Section 3 and Algorithm 1 ('n training samples: x_1, x_2, ..., x_n, where each x_i ~ φ(A* z), z ~ N(0, I_{k×k})'). It does not use a named, publicly available dataset with concrete access information such as a URL or citation. (A sketch of this synthetic generation appears after the table.) |
| Dataset Splits | No | The paper does not explicitly mention the use of a validation set or specific dataset splits for validation. The experiments focus on training and evaluating recovery on generated samples. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory specifications, or cloud instance types) used to run the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., programming languages, libraries, or frameworks with their versions) that would be needed to replicate the experiments. |
| Experiment Setup | Yes | We fix the hidden dimension k = 2, and vary the output dimension over {3, 5, 7} and the sample complexity over {500, 1000, 2000, 5000, 10000}. To visually demonstrate the learning process, we also include a simple comparison for different φ, i.e. leaky ReLU and tanh activations, when k = 1 and d = 2... Each arrow indicates the progress of 500 iteration steps. |
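
The Pseudocode row above quotes only the title of Algorithm 1. As a rough illustration of what an online stochastic gradient descent ascent loop of this shape looks like, here is a minimal numpy sketch, assuming a one-layer generator x = φ(Az) with z ~ N(0, I_k) and a simple linear critic; the critic class, learning rates, and update order are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def sgda_wgan(samples, k, phi, phi_prime, lr_g=1e-3, lr_d=1e-3,
              iters=5000, seed=0):
    """Sketch of online stochastic gradient descent ascent for a WGAN
    with a one-layer generator x = phi(A z), z ~ N(0, I_k).

    The linear critic f_v(x) = v.x is an illustrative simplification,
    not the discriminator class analyzed in the paper.
    """
    rng = np.random.default_rng(seed)
    d = samples.shape[1]
    A = rng.normal(size=(d, k)) / np.sqrt(k)   # generator parameters
    v = np.zeros(d)                            # critic parameters
    for _ in range(iters):
        x_real = samples[rng.integers(len(samples))]   # one real sample
        z = rng.normal(size=k)
        u = A @ z
        x_fake = phi(u)                                # one generated sample
        # Ascent step on the critic: maximize v.(x_real - x_fake)
        v += lr_d * (x_real - x_fake)
        # Descent step on the generator: its WGAN loss is -v.phi(Az),
        # so move A along +d/dA [v.phi(Az)] = (v * phi'(Az)) z^T
        A += lr_g * np.outer(v * phi_prime(u), z)
    return A, v
```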
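The Open Datasets and Experiment Setup rows describe purely synthetic data: n samples x_i = φ(A* z_i) with z_i ~ N(0, I_{k×k}), hidden dimension k = 2, output dimension d ∈ {3, 5, 7}, and n ∈ {500, 1000, 2000, 5000, 10000}. The sketch below generates such data and runs the grid, reusing the sgda_wgan sketch above; the tanh activation and the plain relative Frobenius error are illustrative choices, and the error metric ignores the symmetries up to which the paper measures recovery.

```python
import numpy as np

def make_dataset(A_star, n, phi, rng):
    """Draw n synthetic samples x_i = phi(A* z_i), z_i ~ N(0, I_k)."""
    k = A_star.shape[1]
    Z = rng.normal(size=(n, k))
    return phi(Z @ A_star.T)          # shape (n, d)

def relative_error(A_hat, A_star):
    """Plain relative Frobenius error (ignores sign/rotation symmetries)."""
    return np.linalg.norm(A_hat - A_star) / np.linalg.norm(A_star)

rng = np.random.default_rng(0)
phi = np.tanh
phi_prime = lambda u: 1.0 - np.tanh(u) ** 2
k = 2                                              # hidden dimension from the paper
for d in (3, 5, 7):                                # output dimensions from the paper
    A_star = rng.normal(size=(d, k)) / np.sqrt(k)  # ground-truth generator (illustrative scaling)
    for n in (500, 1000, 2000, 5000, 10000):       # sample sizes from the paper
        X = make_dataset(A_star, n, phi, rng)
        A_hat, _ = sgda_wgan(X, k, phi, phi_prime)
        print(f"d={d} n={n} rel_err={relative_error(A_hat, A_star):.3f}")
```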