Robust Generalization despite Distribution Shift via Minimum Discriminating Information

Authors: Tobias Sutter, Andreas Krause, Daniel Kuhn

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Lastly, we demonstrate the versatility of our framework by applying it to two rather distinct applications: (1) training classifiers on systematically biased data and (2) off-policy evaluation in Markov decision processes. Section 6 (Experimental results): We now assess the empirical performance of the MDI-DRO method in our two running examples.
Researcher Affiliation | Academia | Tobias Sutter (University of Konstanz, Germany; tobias.sutter@uni-konstanz.de), Andreas Krause (ETH Zurich, Switzerland; krausea@ethz.ch), Daniel Kuhn (EPFL, Switzerland; daniel.kuhn@epfl.ch)
Pseudocode | Yes | Algorithm 1: Fast gradient method for smooth & strongly convex optimization [47] (a generic sketch of this scheme appears after the table)
Open Source Code | Yes | All simulations were implemented in MATLAB and run on a 4 GHz CPU with 16 GB RAM. The MATLAB code for reproducing the plots is available from https://github.com/tobsutter/PMDI_DRO.
Open Datasets | Yes | The second experiment addresses the heart disease classification task of Example 3.2 based on a real dataset consisting of N i.i.d. samples from an unknown test distribution P. [...] (Dataset: https://www.kaggle.com/ronitf/heart-disease-uci; a loading sketch appears after the table)
Dataset Splits | No | The paper mentions 'training data' and a 'test distribution' but does not specify how the dataset was split into training, validation, and test sets, whether by percentages, counts, or reference to standard splits.
Hardware Specification | Yes | All simulations were implemented in MATLAB and run on a 4 GHz CPU with 16 GB RAM.
Software Dependencies | No | The paper states 'All simulations were implemented in MATLAB' but does not provide version numbers for MATLAB or any other software libraries or dependencies.
Experiment Setup | No | The paper describes the problem settings and compares different methods but does not provide specific hyperparameter values (e.g., learning rate, batch size) or detailed training configurations in the main text.
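
On the pseudocode row above: Algorithm 1 in the paper is the standard fast (accelerated) gradient method for smooth, strongly convex minimization, attributed to reference [47]. The sketch below is a generic textbook version of that scheme in Python, not the authors' MATLAB implementation; the function name, the quadratic test objective, and the iteration count are illustrative assumptions.

    import numpy as np

    def fast_gradient_method(grad, x0, L, mu, n_iter=500):
        """Generic fast gradient method for an L-smooth, mu-strongly
        convex objective (constant-momentum Nesterov scheme)."""
        kappa = L / mu                                       # condition number
        beta = (np.sqrt(kappa) - 1) / (np.sqrt(kappa) + 1)   # momentum weight
        x = np.array(x0, dtype=float)
        y = x.copy()
        for _ in range(n_iter):
            x_next = y - grad(y) / L              # gradient step, step size 1/L
            y = x_next + beta * (x_next - x)      # momentum extrapolation
            x = x_next
        return x

    # Illustrative use on a strongly convex quadratic 0.5*x'Ax - b'x.
    A = np.array([[3.0, 0.5], [0.5, 2.0]])
    b = np.array([1.0, -1.0])
    L_const = max(np.linalg.eigvalsh(A))    # smoothness constant (largest eigenvalue)
    mu_const = min(np.linalg.eigvalsh(A))   # strong convexity constant (smallest eigenvalue)
    x_star = fast_gradient_method(lambda x: A @ x - b, np.zeros(2), L_const, mu_const)
    print(x_star)  # approaches the solution of A x = b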
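
On the open-datasets row: the second experiment uses the Kaggle heart-disease dataset linked above. A minimal loading sketch, assuming the heart.csv file name and the target label column from the Kaggle release (neither detail is stated in the paper):

    import pandas as pd

    # File name and label column follow the Kaggle dataset's published
    # layout; they are assumptions, not details reported in the paper.
    df = pd.read_csv("heart.csv")
    X = df.drop(columns=["target"])   # clinical features
    y = df["target"]                  # 1 = heart disease present, 0 = absent
    print(X.shape, y.value_counts())

As the dataset-splits row records, the paper does not specify a train/validation/test split, so any partition of this data for reproduction is the reproducer's own choice.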