Pareto GAN: Extending the Representational Power of GANs to Heavy-Tailed Distributions
Authors: Todd Huster, Jeremy Cohen, Zinan Lin, Kevin Chan, Charles Kamhoua, Nandi O. Leslie, Cho-Yu Jason Chiang, Vyas Sekar
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we evaluate our proposed approach on a variety of heavy-tailed datasets. (Section 6: Experiments) |
| Researcher Affiliation | Collaboration | 1Perspecta Labs, Basking Ridge, NJ, USA 2Carnegie Mellon University, Pittsburgh, Pennsylvania, USA 3Army Research Lab, Adelphi, Maryland, USA 4Raytheon Technologies, Adelphi, Maryland, USA. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using 'an open-source implementation of the kernel-type estimator' and provides a link to that external tool, but it does not state that its own Pareto GAN code is open source or provide a link for it. |
| Open Datasets | Yes | 136M Keystrokes. This dataset includes interarrival times between keystrokes for a variety of users (Dhakal et al., 2018). Wikipedia Web Traffic. This dataset includes the daily number of views of various Wikipedia articles during 2015 and 2016 (https://www.kaggle.com/c/web-traffic-time-series-forecasting). SNAP LiveJournal. This dataset consists of a network graph for the LiveJournal social network (Leskovec et al., 2008). S&P 500 Daily Changes. This dataset consists of the daily prices of the S&P 500 stocks from 1999 through 2013 (downloaded from https://quantquote.com/historical-stock-data; the data has since been removed). |
| Dataset Splits | Yes | We randomly partition the data into training, validation, and test sets. The training and validation sets each comprise a small fraction of the full dataset (<10%), while the remainder becomes the test set. (See the split sketch below the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or cloud instance specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'scikit-learn' and 'ReLU activations', but it does not specify version numbers for these or other key software libraries or frameworks used for implementation (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | The network consisted of four fully connected layers with 32 hidden units per layer and ReLU activations. We used a batch size of 256 in all cases. We vary the learning rate from 10^-4 to 10^-6 and train for 20,000 iterations. For Pareto GAN, we use the distance d_γ(·, ·) from Definition 9, with γ = 2 on all datasets. The generator network consists of 4 fully connected layers with 256 units on each layer. The batch size is 256, and the number of training iterations is 200,000. (See the generator sketch below the table.) |
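
As a rough illustration of the partitioning described in the Dataset Splits row, here is a minimal sketch. The exact fractions, the random seed, and the function name are assumptions; the paper states only that training and validation each receive less than 10% of the data.

```python
import numpy as np

def partition(data, train_frac=0.05, val_frac=0.05, seed=0):
    """Randomly partition data into train/val/test sets.

    The small train/val fractions (<10% each) follow the paper;
    the exact values and the seed are assumptions.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(data))
    n_train = int(train_frac * len(data))
    n_val = int(val_frac * len(data))
    train = data[idx[:n_train]]
    val = data[idx[n_train:n_train + n_val]]
    test = data[idx[n_train + n_val:]]
    return train, val, test
```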
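
And a minimal PyTorch sketch of the generator architecture described in the Experiment Setup row (4 fully connected layers, 256 units per layer, ReLU activations, batch size 256). The latent and output dimensions and the optimizer are assumptions, and the paper's Pareto output transformation is omitted:

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Four fully connected layers with 256 units and ReLU activations,
    per the Experiment Setup row. latent_dim and out_dim are assumptions;
    the paper's Pareto output layer is not reproduced here.
    """
    def __init__(self, latent_dim=32, hidden=256, out_dim=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, z):
        return self.net(z)

# The paper reports batch size 256 and 200,000 training iterations;
# the optimizer choice and learning rate here are assumptions.
gen = Generator()
opt = torch.optim.Adam(gen.parameters(), lr=1e-4)
z = torch.randn(256, 32)
samples = gen(z)
```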