Generalization Bounds with Minimal Dependency on Hypothesis Class via Distributionally Robust Optimization

Authors: Yibo Zeng, Henry Lam

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct simple experiments to study the numerical behavior of our MMD DRO and compare with ERM on simulated data, which serves to illustrate the potential of MMD DRO in improving generalization and to validate our developed theory. We adapt the experiment setup from [9, Section 5.2] and consider a quadratic loss with linear perturbation: l(θ, z) = (1/2)||θ − v||_2^2 + z^T(θ − v), where z ~ Unif[−B, B]^d with constant B varying over {1, 10, 100} in the experiment.
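The quoted loss can be sketched numerically. This is a hypothetical illustration, not the authors' code: the target vector v and the evaluation point θ are assumptions made here for demonstration; only the loss form l(θ, z) = (1/2)||θ − v||_2^2 + z^T(θ − v) and the sampling z ~ Unif[−B, B]^d come from the quoted setup.

```python
import numpy as np

# Sketch of the quoted simulated-data setup: quadratic loss with linear
# perturbation, l(theta, z) = 0.5*||theta - v||_2^2 + z^T (theta - v),
# with z ~ Unif[-B, B]^d. The choices of v, theta, n, and the seed are
# assumptions for illustration only.
rng = np.random.default_rng(0)
d, B, n = 5, 10, 100
v = np.zeros(d)                       # assumed target vector (not quoted in the paper)
z = rng.uniform(-B, B, size=(n, d))   # n i.i.d. perturbation samples

def loss(theta, z):
    """Quadratic loss with linear perturbation, vectorized over samples z."""
    diff = theta - v
    return 0.5 * np.dot(diff, diff) + z @ diff

theta = rng.standard_normal(d)
empirical_risk = loss(theta, z).mean()  # ERM objective at this theta
```

Note that at θ = v the loss vanishes for every z, so the perturbation term only matters away from the minimizer; this is what lets the experiment isolate the effect of B on generalization.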
Researcher Affiliation | Academia | Yibo Zeng, Columbia University, yibo.zeng@columbia.edu; Henry Lam, Columbia University, henry.lam@columbia.edu
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks; it focuses on mathematical derivations and proofs.
Open Source Code | No | The paper mentions using CVX [89] and MOSEK [90] but does not state that the authors' own code for the described methodology is open-source or publicly available.
Open Datasets | No | The paper uses simulated data generated for the experiment rather than a publicly available dataset with concrete access information: 'l(θ, z) = (1/2)||θ − v||_2^2 + z^T(θ − v), where z ~ Unif[−B, B]^d with constant B varying over {1, 10, 100} in the experiment.'
Dataset Splits | No | The paper describes generating simulated data and varying the sample size n, but it does not specify explicit training, validation, or test splits. It mentions 'future test data' in a general sense, not a formal split used in the numerical experiments.
Hardware Specification | Yes | 'Our computational environment is a Mac mini with Apple M1 chip, 8 GB RAM and all algorithms are implemented in Python 3.8.3.'
Software Dependencies | Yes | 'Our computational environment is a Mac mini with Apple M1 chip, 8 GB RAM and all algorithms are implemented in Python 3.8.3.' ... 'The sampled program is then solved by CVX [89] and MOSEK [90].' ... 'Version 9.2.46.'
Experiment Setup | Yes | 'We adapt the experiment setup from [9, Section 5.2] and consider a quadratic loss with linear perturbation: l(θ, z) = (1/2)||θ − v||_2^2 + z^T(θ − v), where z ~ Unif[−B, B]^d with constant B varying over {1, 10, 100} in the experiment.' ... 'In applying MMD DRO, we use the Gaussian kernel k(z, z') = exp(−||z − z'||^2 / σ^2) with σ set to the median of {||z_i − z_j||^2 : i, j}, following the well-known median heuristic [67].' ... 'tune the ball size via the best among η ∈ {0.01, 0.05, 0.1, 0.15, 0.2, 0.3, . . . , 1.0}' ... 'd = 5.'
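The kernel-bandwidth choice quoted above can be sketched as follows. This is a hedged illustration, not the authors' code: the placeholder data Z and its size are assumptions, and the code follows the quoted text literally (σ taken as the median of pairwise squared distances, then used as σ² in the exponent); conventions for the median heuristic vary across papers.

```python
import numpy as np

# Sketch of the quoted kernel setup: Gaussian kernel
# k(z, z') = exp(-||z - z'||^2 / sigma^2), with sigma set to the median of
# the pairwise squared distances {||z_i - z_j||^2}, per the quoted text.
# The data Z below is a placeholder, not the paper's simulated data.
rng = np.random.default_rng(1)
n, d = 50, 5
Z = rng.uniform(-1.0, 1.0, size=(n, d))

# Pairwise squared Euclidean distances ||z_i - z_j||^2 via broadcasting.
sq_dists = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)

# Median heuristic over the off-diagonal (i < j) squared distances.
sigma = np.median(sq_dists[np.triu_indices(n, k=1)])

# Gram matrix of the Gaussian kernel; K[i, i] = 1 by construction.
K = np.exp(-sq_dists / sigma**2)
```

The median heuristic is data-dependent, which is why the report lists it under experiment setup: the bandwidth is recomputed from each simulated sample rather than fixed in advance.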