Fundamental Tradeoffs in Distributionally Adversarial Training

Authors: Mohammad Mehrabi, Adel Javanmard, Ryan A. Rossi, Anup Rao, Tung Mai

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Figure 1 shows the effect of various parameters on the Pareto-optimal tradeoff between adversarial risk (AR) and standard risk (SR) in the linear regression setting. Data are generated according to the linear model y = x^T θ0 + w with w ~ N(0, 1) and features x_i sampled i.i.d. from N(0, Σ), where Σ_{i,j} = ρ^{|i-j|}. Figure 1b investigates the role of dependency across features (ρ) in the optimal tradeoff between standard and adversarial risks. Figure 2 showcases the effect of different factors on the Pareto-optimal tradeoff between standard and adversarial risks in a binary classification setting. The Pareto-optimal points {(SR(θ_λ), AR(θ_λ)) : λ ≥ 0} are plotted in Figure 3.
Researcher Affiliation | Collaboration | Mohammad Mehrabi *1, Adel Javanmard *1, Ryan A. Rossi 2, Anup B. Rao 2, Tung Mai 2. *Equal contribution. 1Department of Data Sciences and Operations, USC Marshall School of Business, University of Southern California, USA; 2Adobe Research, USA. Correspondence to: Adel Javanmard <ajavanma@usc.edu>.
Pseudocode | No | The paper contains mathematical formulations and theoretical derivations, but it does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the methodology described, nor does it include a link to a code repository.
Open Datasets | No | The paper primarily uses synthetic data models for its analysis and simulations, such as "data generated according to the linear model y = x^T θ0 + w" and "the features x are drawn from N(yμ, Σ)". It mentions generating n = 500K samples of x ~ Unif(S^{d-1}(√d)), but it does not provide concrete access information (e.g., a URL, DOI, or specific citation to an established public dataset) for a publicly available or open dataset.
Dataset Splits | No | The paper mentions using "empirical loss with n = 500K samples" for computation but does not provide specific details on training/validation/test dataset splits (e.g., percentages, sample counts, or citations to predefined splits) needed to reproduce the experiment's data partitioning.
Hardware Specification | No | The paper does not provide any specific hardware details (e.g., exact GPU/CPU models, processor types, memory amounts, or cloud instance specifications) used for running its experiments.
Software Dependencies | No | The paper does not provide any specific ancillary software details, such as library names with version numbers (e.g., Python 3.8, PyTorch 1.9), which are necessary to replicate the experiment environment.
Experiment Setup | No | While the paper specifies parameters for data generation (e.g., d = 10, ε = 0.2, σ = 2) and mentions using gradient descent for optimization, it lacks crucial setup details such as specific hyperparameter values (e.g., learning rate, batch size, number of epochs) for the optimization process, which are necessary for full reproducibility.
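The synthetic data described in the responses above can be reproduced from the stated distributions alone. The following NumPy sketch generates data from the linear model y = x^T θ0 + w with w ~ N(0, 1), features drawn from N(0, Σ) with Σ_{i,j} = ρ^{|i-j|}, and uniform samples on the sphere of radius √d. The values of ρ, θ0, the random seed, and the sample size n used here are illustrative assumptions, not taken from the paper (which uses n = 500K for its empirical-loss computation).

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative seed, not from the paper
d, n, rho = 10, 1000, 0.5       # d = 10 matches the paper; n and rho are assumed

# AR(1)-style covariance: Sigma[i, j] = rho^|i - j|
idx = np.arange(d)
Sigma = rho ** np.abs(idx[:, None] - idx[None, :])

# Linear regression data: y = x^T theta0 + w, w ~ N(0, 1), x ~ N(0, Sigma)
theta0 = rng.standard_normal(d)  # illustrative choice of theta0
X = rng.multivariate_normal(np.zeros(d), Sigma, size=n)
y = X @ theta0 + rng.standard_normal(n)

# Uniform samples on the sphere S^{d-1}(sqrt(d)): normalize Gaussian draws,
# then rescale to radius sqrt(d)
Z = rng.standard_normal((n, d))
x_sphere = np.sqrt(d) * Z / np.linalg.norm(Z, axis=1, keepdims=True)
```

Given such samples, the standard and adversarial risks of a candidate θ can be estimated empirically, which is how the paper computes its Pareto-optimal tradeoff curves.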