Tighter Expected Generalization Error Bounds via Wasserstein Distance
Authors: Borja Rodríguez-Gálvez, Germán Bassi, Ragnar Thobaben, Mikael Skoglund
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Example 1 (Gaussian location model). Consider the problem of estimating the mean $\mu$ of a $d$-dimensional Gaussian distribution with known covariance matrix $\sigma^2 I_d$. Further consider that there are $n$ samples $S = (Z_1, \ldots, Z_n)$ available, the loss is measured with the Euclidean distance $\ell(w, z) = \lVert w - z \rVert_2$, and the estimation is their empirical mean $W = \frac{1}{n} \sum_{i=1}^{n} Z_i$. In this example, the expected generalization error can be calculated exactly (see Appendix E): $\mathrm{gen}(W, S) = \sqrt{\frac{2\sigma^2}{n}} \left( \sqrt{n+1} - \sqrt{n-1} \right) \frac{\Gamma((d+1)/2)}{\Gamma(d/2)}$. ... Figure 1: Expected generalization error and generalization error bounds for the Gaussian location model with $\mathcal{N}(\mu, 1)$ (left) and $\mathcal{N}(\mu, I_{250})$ (right). See Appendix E for the details. |
| Researcher Affiliation | Collaboration | Borja Rodríguez-Gálvez KTH Royal Institute of Technology Stockholm, Sweden borjarg@kth.se; Germán Bassi Ericsson Research Stockholm, Sweden german.bassi@ericsson.com; Ragnar Thobaben KTH Royal Institute of Technology Stockholm, Sweden ragnart@kth.se; Mikael Skoglund KTH Royal Institute of Technology Stockholm, Sweden skoglund@kth.se |
| Pseudocode | No | The paper describes mathematical proofs and outlines their steps but does not include any pseudocode or algorithm blocks with structured steps. |
| Open Source Code | No | The paper's checklist under “3. If you ran experiments...” explicitly states “[N/A]” for “Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)?”. No other statement about code release is found. |
| Open Datasets | No | The paper uses a theoretical “Gaussian location model” as an example for analytical calculations, which defines samples (Z_i) from a distribution (P_Z). It does not use or provide concrete access information (link, citation, repository) for a publicly available, named dataset. |
| Dataset Splits | No | The paper is theoretical and presents analytical calculations for a specific model; it does not involve empirical experiments with data splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and involves analytical derivations and calculations rather than empirical experiments, thus no hardware specifications for running experiments are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not describe software used for its analysis, thus no software dependencies with version numbers are provided. |
| Experiment Setup | No | The paper is theoretical and presents analytical derivations and comparisons. It does not describe an experimental setup with specific hyperparameters, training configurations, or system-level settings, as no empirical experiments were conducted. |
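
The Gaussian location model of Example 1 is simple enough to check numerically. The following is a minimal sketch, not the authors' code (the paper's checklist marks code as N/A). It assumes the setup described in the table: the loss is the unsquared Euclidean distance and W is the sample mean. Under those assumptions, W − Z is distributed as N(0, σ²(1 + 1/n) I_d) for a fresh sample Z and W − Z_i as N(0, σ²(1 − 1/n) I_d) for a training sample Z_i, so the expected generalization error is the gap between two chi-distribution means. The function names `chi_mean`, `closed_form_gen`, and `monte_carlo_gen` are illustrative and do not come from the paper.

```python
import math
import random


def chi_mean(d: int) -> float:
    """Expected Euclidean norm of a standard normal vector in R^d."""
    # lgamma avoids the overflow that gamma() would hit for large d (e.g. 250).
    return math.sqrt(2.0) * math.exp(math.lgamma((d + 1) / 2) - math.lgamma(d / 2))


def closed_form_gen(n: int, d: int, sigma: float) -> float:
    """Expected generalization error for loss ||w - z||_2 and W = sample mean.

    Assumes n >= 1. Uses Var(W - Z) = sigma^2 (1 + 1/n) for a fresh Z and
    Var(W - Z_i) = sigma^2 (1 - 1/n) for a training sample, per coordinate.
    """
    return sigma * chi_mean(d) * (math.sqrt(1 + 1 / n) - math.sqrt(1 - 1 / n))


def monte_carlo_gen(n: int, d: int, sigma: float, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of E[population loss - empirical loss]."""
    rng = random.Random(seed)
    mu = [0.0] * d  # the gap does not depend on the true mean
    total = 0.0
    for _ in range(trials):
        sample = [[rng.gauss(m, sigma) for m in mu] for _ in range(n)]
        # Hypothesis: coordinate-wise empirical mean of the training sample.
        w = [sum(z[k] for z in sample) / n for k in range(d)]
        # One fresh draw gives an unbiased estimate of the population loss.
        fresh = [rng.gauss(m, sigma) for m in mu]
        test_loss = math.dist(w, fresh)
        train_loss = sum(math.dist(w, z) for z in sample) / n
        total += test_loss - train_loss
    return total / trials


if __name__ == "__main__":
    n, d, sigma = 10, 1, 1.0
    print("closed form:", closed_form_gen(n, d, sigma))
    print("monte carlo:", monte_carlo_gen(n, d, sigma, trials=20000))
```

The sketch uses only the standard library so it runs without extra dependencies; for the d = 250 panel of Figure 1 a vectorized implementation would be considerably faster than these pure-Python loops.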