Distributionally Robust Optimization with Data Geometry

Authors: Jiashuo Liu, Jiayun Wu, Bo Li, Peng Cui

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments justify the superiority of our GDRO to existing DRO methods in multiple settings with strong distributional shifts, and confirm that the uncertainty set of GDRO adapts to data geometry."
Researcher Affiliation | Academia | "Jiashuo Liu¹*, Jiayun Wu¹*, Bo Li², Peng Cui¹ (¹Department of Computer Science & Technology, Tsinghua University, Beijing, China; ²School of Economics and Management, Tsinghua University, Beijing, China)"
Pseudocode | Yes | "Algorithm 1: Geometric Wasserstein Distributionally Robust Optimization (GDRO)" (hedged sketches of this loop follow the table)
Open Source Code | No | "3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [No]" (quoted from the paper's NeurIPS checklist)
Open Datasets | Yes | "Ionosphere Radar Dataset. https://archive-beta.ics.uci.edu/ml/datasets/ionosphere."
Dataset Splits | No | The paper describes how training and testing data are generated or sampled for the simulations and real-world datasets, but it does not explicitly state the train/validation/test splits (e.g., percentages or counts) needed for reproduction.
Hardware Specification | No | "The simulation of gradient flow in Equation 6 is implemented by message propagation with DGL package [32], which scales linearly with sample size and enjoys parallelization by GPU." (GPU use is mentioned, but no specific hardware model, count, or memory is stated.)
Software Dependencies | No | "The simulation of gradient flow in Equation 6 is implemented by message propagation with DGL package [32], which scales linearly with sample size and enjoys parallelization by GPU." (DGL is named, but no version numbers for it or any other dependency are given; a DGL message-passing sketch follows the table.)
Experiment Setup | Yes | "Implementation Details: For all experiments, G0 is constructed as a k-nearest-neighbor graph from the training data only at the initialization step. Specifically, we adopt NN-Descent [10] to efficiently estimate the k-nearest-neighbor graph... We adopt MSE as the empirical loss function for regression tasks and cross-entropy for classification tasks. We use MLPs for the Colored MNIST and Ionosphere datasets, and linear models in the other experiments. Besides, we find that the two-stage optimization is enough for good performances... Algorithm 1 input: training dataset D_tr = {(x_i, y_i)}_{i=1}^n, learning rate α_θ, gradient flow iterations T, entropy term β, manifold representation G0 (learned by the kNN algorithm from D_tr)."
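
The setup quoted above fixes several concrete components: a kNN graph G0 estimated with NN-Descent, a two-stage optimization, and a gradient flow simulated via DGL message propagation. As a first piece, here is a minimal sketch of the G0 construction, assuming the pynndescent package as the NN-Descent implementation and a Gaussian kernel on distances as the edge weight; the paper names neither, so both are illustrative choices.

```python
# Hedged sketch: estimating the kNN graph G0 with NN-Descent.
# Assumptions: pynndescent as the NN-Descent implementation; the Gaussian
# edge weight is an illustrative choice, not taken from the paper.
import numpy as np
from pynndescent import NNDescent

def build_g0(X: np.ndarray, k: int = 5):
    """Return (src, dst, weight) arrays for a directed kNN graph over X."""
    index = NNDescent(X, n_neighbors=k + 1)   # +1: each point typically appears as its own first neighbor
    neighbors, dists = index.neighbor_graph   # both of shape (n, k + 1)
    n = X.shape[0]
    src = np.repeat(np.arange(n), k)
    dst = neighbors[:, 1:].ravel()            # drop the self-neighbor column
    w = np.exp(-dists[:, 1:].ravel() ** 2)    # illustrative Gaussian edge weights
    return src, dst, w
```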
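Building on that graph, the following sketch is one reading of Algorithm 1's two-stage loop for the regression (MSE) case. It is not the authors' implementation: an upwind finite-difference update stands in for the paper's Equation 6, scikit-learn's exact kNN replaces NN-Descent for self-containment, and the flow step size eta_p is a hypothetical knob the paper does not name.

```python
# Hedged sketch of GDRO's two-stage loop (regression / MSE case).
# Assumptions: an upwind discretization stands in for the paper's Equation 6;
# eta_p is our own step-size knob, not a parameter from the paper.
import numpy as np
import torch
from sklearn.neighbors import kneighbors_graph

def per_sample_loss(model, X, y):
    return (model(X).squeeze(-1) - y) ** 2            # MSE per sample

def gdro(model, X, y, k=5, beta=0.1, T=50, eta_p=1e-2, lr=1e-2, outer_steps=100):
    n = X.shape[0]
    A = kneighbors_graph(X, k, mode="connectivity")   # G0 from training data only
    A = ((A + A.T) > 0).astype(float)                 # symmetrize
    src, dst = A.nonzero()
    p = np.full(n, 1.0 / n)                           # sample weights, start uniform
    Xt = torch.as_tensor(X, dtype=torch.float32)
    yt = torch.as_tensor(y, dtype=torch.float32)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(outer_steps):
        with torch.no_grad():
            losses = per_sample_loss(model, Xt, yt).numpy()
        # Inner max: flow probability mass along G0 toward high-potential nodes.
        for _ in range(T):
            phi = losses - beta * np.log(p)           # gradient of the loss-plus-entropy objective
            gap = phi[dst] - phi[src]                 # potential gap per directed edge
            flux = np.where(gap > 0, gap * p[src], 0.0)  # upwind: mass leaves the lower-potential end
            np.add.at(p, dst, eta_p * flux)
            np.add.at(p, src, -eta_p * flux)          # conserves total mass exactly
            p = np.clip(p, 1e-8, None)
            p /= p.sum()
        # Outer min: one weighted ERM step on the learner.
        opt.zero_grad()
        w = torch.as_tensor(p, dtype=torch.float32)
        (w * per_sample_loss(model, Xt, yt)).sum().backward()
        opt.step()
    return model, p
```

With a linear learner such as torch.nn.Linear(d, 1), the inner loop moves weight p toward high-loss samples along edges of G0, which is how an uncertainty set that "adapts to data geometry" plays out in practice.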
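Finally, the hardware and software rows both quote the detail that this flow runs as message propagation with the DGL package. One flow step of the sketch above, re-expressed in DGL's message-passing API, might look as follows; the flux rule is the same upwind stand-in as before, not the paper's exact Equation 6, and g is assumed to be built as dgl.graph((src, dst)) from the arrays returned by build_g0.

```python
# Hedged sketch: one flow step as DGL message propagation.
# The flux rule mirrors the numpy sketch above, not the paper's exact scheme.
import dgl
import dgl.function as fn
import torch

def flow_step(g, losses: torch.Tensor, p: torch.Tensor,
              beta: float = 0.1, eta_p: float = 1e-2) -> torch.Tensor:
    g.ndata["phi"] = losses - beta * torch.log(p)
    g.ndata["p"] = p

    def edge_flux(edges):
        gap = edges.dst["phi"] - edges.src["phi"]
        # Upwind rule: mass moves src -> dst only when dst has higher potential.
        return {"flux": torch.clamp(gap, min=0.0) * edges.src["p"]}

    g.apply_edges(edge_flux)
    # Inflow: sum incoming flux at each node; outflow: the same sums on the reversed graph.
    g.update_all(fn.copy_e("flux", "m"), fn.sum("m", "inflow"))
    rg = dgl.reverse(g, copy_edata=True)
    rg.update_all(fn.copy_e("flux", "m"), fn.sum("m", "outflow"))
    p = p + eta_p * (g.ndata["inflow"] - rg.ndata["outflow"])
    p = torch.clamp(p, min=1e-8)
    return p / p.sum()
```

Each step touches every edge exactly once, consistent with the quoted claim that the simulation scales linearly with sample size and parallelizes on GPU.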