Robust Generalization despite Distribution Shift via Minimum Discriminating Information
Authors: Tobias Sutter, Andreas Krause, Daniel Kuhn
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Lastly, we demonstrate the versatility of our framework by applying it to two rather distinct applications: (1) training classifiers on systematically biased data and (2) off-policy evaluation in Markov Decision Processes. [...] Section 6 (Experimental results): We now assess the empirical performance of the MDI-DRO method in our two running examples. |
| Researcher Affiliation | Academia | Tobias Sutter University of Konstanz, Germany tobias.sutter@uni-konstanz.de Andreas Krause ETH Zurich, Switzerland krausea@ethz.ch Daniel Kuhn EPFL, Switzerland daniel.kuhn@epfl.ch |
| Pseudocode | Yes | Algorithm 1: Fast gradient method for smooth & strongly convex optimization [47] |
| Open Source Code | Yes | All simulations were implemented in MATLAB and run on a 4GHz CPU with 16Gb RAM. The Matlab code for reproducing the plots is available from https://github.com/tobsutter/PMDI_DRO. |
| Open Datasets | Yes | The second experiment addresses the heart disease classification task of Example 3.2 based on a real dataset (https://www.kaggle.com/ronitf/heart-disease-uci) consisting of N i.i.d. samples from an unknown test distribution P. |
| Dataset Splits | No | The paper mentions 'training data' and 'test distribution' but does not provide specific details on how the dataset was split into training, validation, and test sets, either by percentages, counts, or by referencing standard splits. |
| Hardware Specification | Yes | All simulations were implemented in MATLAB and run on a 4GHz CPU with 16Gb RAM. |
| Software Dependencies | No | The paper states 'All simulations were implemented in MATLAB' but does not provide specific version numbers for MATLAB or any other software libraries/dependencies. |
| Experiment Setup | No | The paper describes the problem settings and compares different methods but does not provide specific hyperparameter values (e.g., learning rate, batch size) or detailed training configurations for reproducibility within the main text. |
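The pseudocode evidence above names a fast gradient method for smooth and strongly convex optimization. For readers unfamiliar with that primitive, the following is a minimal illustrative sketch (in Python rather than the paper's MATLAB) of a Nesterov-style accelerated gradient method with the standard momentum coefficient for an L-smooth, mu-strongly convex objective; the toy quadratic, function names, and parameter values are assumptions for illustration, not taken from the paper or its repository.

```python
import numpy as np

def fast_gradient(grad, x0, L, mu, iters=200):
    """Nesterov-style accelerated gradient descent for an
    L-smooth, mu-strongly convex objective (illustrative sketch)."""
    kappa = L / mu
    beta = (np.sqrt(kappa) - 1) / (np.sqrt(kappa) + 1)  # momentum coefficient
    x = y = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x_next = y - grad(y) / L          # gradient step with step size 1/L
        y = x_next + beta * (x_next - x)  # momentum extrapolation
        x = x_next
    return x

# Toy quadratic f(x) = 0.5 * x^T A x with eigenvalues mu = 1 and L = 10;
# the unique minimizer is the origin.
A = np.diag([1.0, 10.0])
grad = lambda y: A @ y
x_star = fast_gradient(grad, x0=[5.0, 5.0], L=10.0, mu=1.0)
print(np.linalg.norm(x_star))  # close to 0 after 200 iterations
```

On a quadratic with condition number kappa = 10, this method contracts at roughly a (1 - 1/sqrt(kappa)) rate per iteration, which is why 200 iterations suffice here; the reproducibility gap flagged in the table is that the paper does not report such iteration counts or step-size choices for its own runs.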