Fundamental Tradeoffs in Distributionally Adversarial Training
Authors: Mohammad Mehrabi, Adel Javanmard, Ryan A. Rossi, Anup Rao, Tung Mai
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Figure 1 shows the effect of various parameters on the Pareto-optimal tradeoff between adversarial (AR) and standard (SR) risks in the linear regression setting. We consider data generated according to the linear model y = x^T θ_0 + w with w ~ N(0, 1) and features x_i sampled i.i.d. from N(0, Σ), where Σ_{ij} = ρ^{|i−j|}. Figure 1b investigates the role of dependency across features (ρ) in the optimal tradeoff between standard and adversarial risks. Figure 2 showcases the effect of different factors on the Pareto-optimal tradeoff between standard and adversarial risks in a binary classification setting. The Pareto-optimal points {(SR(θ_λ), AR(θ_λ)) : λ ≥ 0} are plotted in Figure 3. (A data-generation sketch for the regression model follows this table.) |
| Researcher Affiliation | Collaboration | Mohammad Mehrabi*¹, Adel Javanmard*¹, Ryan A. Rossi², Anup B. Rao², Tung Mai². *Equal contribution. ¹Department of Data Sciences and Operations, USC Marshall School of Business, University of Southern California, USA; ²Adobe Research, USA. Correspondence to: Adel Javanmard <ajavanma@usc.edu>. |
| Pseudocode | No | The paper contains mathematical formulations and theoretical derivations, but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the methodology described, nor does it include a link to a code repository. |
| Open Datasets | No | The paper primarily uses synthetic data models for its analysis and simulations, such as "data generated according to the linear model y = x^T θ_0 + w" and "the features x are drawn from N(yµ, Σ)". It mentions generating "n = 500K samples of x ~ Unif(S^{d−1}(√d))", but it does not provide concrete access information (e.g., URL, DOI, specific citation to an established public dataset) for a publicly available or open dataset. (A sphere-sampling sketch follows this table.) |
| Dataset Splits | No | The paper mentions using "empirical loss with n = 500K samples" for computation but does not provide specific details on training/validation/test dataset splits (e.g., percentages, sample counts, or citations to predefined splits) needed to reproduce the experiment's data partitioning. |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., exact GPU/CPU models, processor types, memory amounts, or cloud instance specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide any specific ancillary software details, such as library names with version numbers (e.g., Python 3.8, PyTorch 1.9), which are necessary to replicate the experiment environment. |
| Experiment Setup | No | While the paper specifies parameters for data generation (e.g., d = 10, ε = 0.2, σ = 2) and mentions using "gradient descent" for optimization, it lacks crucial details of the experimental setup, such as specific hyperparameter values (e.g., learning rate, batch size, number of epochs) for the optimization process, which are necessary for full reproducibility. (An illustrative gradient-descent sketch, under stated assumptions, follows this table.) |
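
To make the regression setting quoted in the Research Type row concrete, here is a minimal sketch (not the authors' code, which is not released) of generating data from the linear model y = x^T θ_0 + w with w ~ N(0, 1) and x ~ N(0, Σ), Σ_{ij} = ρ^{|i−j|}. The values of n, ρ, and θ_0 below are illustrative assumptions; only d = 10 is reported in the paper.

```python
import numpy as np

def generate_linear_data(n, d, rho, theta0, seed=None):
    """Draw n samples (x_i, y_i) from y = x^T theta0 + w, w ~ N(0, 1),
    with x ~ N(0, Sigma) and Sigma_ij = rho^|i-j| (Toeplitz covariance)."""
    rng = np.random.default_rng(seed)
    idx = np.arange(d)
    Sigma = rho ** np.abs(idx[:, None] - idx[None, :])   # rho^|i-j| entrywise
    X = rng.multivariate_normal(np.zeros(d), Sigma, size=n)
    w = rng.standard_normal(n)                            # noise w ~ N(0, 1)
    return X, X @ theta0 + w

# Illustrative usage with d = 10 as reported in the paper; theta0 is arbitrary here.
d = 10
theta0 = np.ones(d) / np.sqrt(d)
X, y = generate_linear_data(n=1000, d=d, rho=0.5, theta0=theta0, seed=0)
```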
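
The Open Datasets row quotes the paper's sampling of x uniformly from the sphere S^{d−1}(√d). A minimal sketch of that sampling step, using the standard recipe of normalizing a Gaussian vector and rescaling to radius √d (the seed and the use of d = 10 are assumptions):

```python
import numpy as np

def sample_uniform_sphere(n, d, seed=None):
    """n i.i.d. draws from Unif(S^{d-1}(sqrt(d))), i.e. {x : ||x||_2 = sqrt(d)}."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((n, d))
    return np.sqrt(d) * g / np.linalg.norm(g, axis=1, keepdims=True)

x = sample_uniform_sphere(n=500_000, d=10, seed=0)   # n = 500K as mentioned in the paper
```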
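
The Experiment Setup row notes that the paper's gradient-descent hyperparameters are not reported. The sketch below is therefore only an illustration of adversarial training by full-batch gradient descent under an added assumption: perturbations are ℓ2-bounded with budget ε, in which case the pointwise adversarial squared loss of a linear model has the closed form (|y − x^T θ| + ε‖θ‖_2)². The learning rate, step count, and synthetic data are illustrative, not the authors' values.

```python
import numpy as np

def adv_grad(theta, X, y, eps):
    """Gradient of the average adversarial squared loss
    (1/n) * sum_i (|y_i - x_i^T theta| + eps * ||theta||_2)^2."""
    r = y - X @ theta
    norm = np.linalg.norm(theta) + 1e-12           # avoid division by zero at theta = 0
    a = np.abs(r) + eps * norm                     # per-sample worst-case residual magnitude
    return 2 * (-(a * np.sign(r)) @ X / len(y) + eps * a.mean() * theta / norm)

def adv_train(X, y, eps, lr=0.01, steps=2000):
    """Plain full-batch gradient descent on the adversarial loss; lr and steps are illustrative."""
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        theta -= lr * adv_grad(theta, X, y, eps)
    return theta

# Illustrative run on synthetic data with eps = 0.2, the perturbation budget reported in the paper.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 10))
y = X @ (np.ones(10) / np.sqrt(10)) + rng.standard_normal(1000)
theta_adv = adv_train(X, y, eps=0.2)
```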