Pareto GAN: Extending the Representational Power of GANs to Heavy-Tailed Distributions
Authors: Todd Huster, Jeremy Cohen, Zinan Lin, Kevin Chan, Charles Kamhoua, Nandi O. Leslie, Cho-Yu Jason Chiang, Vyas Sekar
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we evaluate our proposed approach on a variety of heavy-tailed datasets. (Section 6: Experiments) |
| Researcher Affiliation | Collaboration | 1Perspecta Labs, Basking Ridge, NJ, USA 2Carnegie Mellon University, Pittsburgh, Pennsylvania, USA 3Army Research Lab, Adelphi, Maryland, USA 4Raytheon Technologies, Adelphi, Maryland, USA. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using 'an open-source implementation of the kernel-type estimator' and provides a link to that external tool, but it does not state that its own Pareto GAN code is open source or provide a link for it. |
| Open Datasets | Yes | 136M Keystrokes. This dataset includes interarrival times between keystrokes for a variety of users (Dhakal et al., 2018). Wikipedia Web Traffic. This dataset includes the daily number of views of various Wikipedia articles during 2015 and 2016 (https://www.kaggle.com/c/web-traffic-time-series-forecasting). SNAP LiveJournal. This dataset consists of a network graph for the LiveJournal social network (Leskovec et al., 2008). S&P 500 Daily Changes. This dataset consists of the daily prices of the S&P 500 stocks from 1999 through 2013 (downloaded from https://quantquote.com/historical-stock-data; the data has since been removed). |
| Dataset Splits | Yes | We randomly partition the data into training, validation, and test sets. The training and validation sets each comprise a small fraction of the full dataset (<10%), while the remainder becomes the test set. (See the split sketch below the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or cloud instance specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'scikit-learn' and 'ReLU activations', but it does not specify version numbers for these or other key software libraries or frameworks used for implementation (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | The network consisted of four fully connected layers with 32 hidden units per layer and ReLU activations. We used a batch size of 256 in all cases. We vary the learning rate from 10^-4 to 10^-6 and train for 20,000 iterations. For Pareto GAN, we use the distance d_γ(·, ·) from Definition 9, with γ = 2 on all datasets. The generator network consists of 4 fully connected layers with 256 units on each layer. The batch size is 256, and the number of training iterations is 200,000. (See the generator sketch below the table.) |
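
As a rough illustration of the partitioning described in the Dataset Splits row, here is a minimal sketch. The exact fractions, the random seed, and the function name are assumptions; the paper states only that training and validation each receive less than 10% of the data.

```python
import numpy as np

def partition(data, train_frac=0.05, val_frac=0.05, seed=0):
    """Randomly partition data into train/val/test sets.

    The small train/val fractions (<10% each) follow the paper;
    the exact values and the seed are assumptions.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(data))
    n_train = int(train_frac * len(data))
    n_val = int(val_frac * len(data))
    train = data[idx[:n_train]]
    val = data[idx[n_train:n_train + n_val]]
    test = data[idx[n_train + n_val:]]
    return train, val, test
```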
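
And a minimal PyTorch sketch of the generator architecture described in the Experiment Setup row (4 fully connected layers, 256 units per layer, ReLU activations, batch size 256). The latent and output dimensions and the optimizer are assumptions, and the paper's Pareto output transformation is omitted:

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Four fully connected layers with 256 units and ReLU activations,
    per the Experiment Setup row. latent_dim and out_dim are assumptions;
    the paper's Pareto output layer is not reproduced here.
    """
    def __init__(self, latent_dim=32, hidden=256, out_dim=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, z):
        return self.net(z)

# The paper reports batch size 256 and 200,000 training iterations;
# the optimizer choice and learning rate here are assumptions.
gen = Generator()
opt = torch.optim.Adam(gen.parameters(), lr=1e-4)
z = torch.randn(256, 32)
samples = gen(z)
```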