Rate-Distortion Analysis of Minimum Excess Risk in Bayesian Learning
Authors: Hassan Hafez-Kolahi, Behrad Moniri, Shohreh Kasaei, Mahdieh Soleymani Baghshah
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we build upon and extend the recent results of Xu & Raginsky (2020) to analyze the MER in Bayesian learning and derive information-theoretic bounds on it. We formulate the problem as a (constrained) rate-distortion optimization and show how the solution can be bounded above and below by two other rate-distortion functions that are easier to study. The lower bound represents the minimum possible excess risk achievable by any process using R bits of information from the parameter W. For the upper bound, the optimization is further constrained to use R bits from the training set, a setting which relates MER to information-theoretic bounds on the generalization gap in frequentist learning. We derive information-theoretic bounds on the difference between these upper and lower bounds and show that they can provide order-wise tight rates for MER under certain conditions. This analysis gives more insight into the information-theoretic nature of Bayesian learning and also provides novel bounds. (An illustrative sketch of the rate-distortion argument appears after this table.) |
| Researcher Affiliation | Academia | ¹Department of Computer Engineering, Sharif University of Technology, Tehran, Iran; ²Department of Electrical Engineering, Sharif University of Technology, Tehran, Iran. |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not mention providing open-source code for the methodology described. |
| Open Datasets | No | The paper is theoretical and does not conduct experiments with datasets. It discusses theoretical models such as the Gaussian Location Model and Linear Regression but does not use empirical data. |
| Dataset Splits | No | The paper is theoretical and does not involve dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not mention any hardware specifications used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not list any specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or training configurations. |
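
Since the paper ships no code, the following is a minimal, hypothetical sketch of the rate-distortion intuition summarized in the Research Type row. It assumes a Gaussian location model `Z_i = W + eps_i` with assumed prior variance `sigma2_w` and noise variance `sigma2`, and uses squared error on W as the distortion. The paper's actual distortion is the excess risk of the downstream prediction task, so this is only an analogy to the shape of the bounds, not a reproduction of them.

```python
# Hypothetical sketch (not from the paper): Gaussian location model
# Z_i = W + eps_i, eps_i ~ N(0, sigma2), with prior W ~ N(0, sigma2_w).
# It evaluates the classic Gaussian distortion-rate function
# D(R) = sigma2_w * exp(-2R) (squared-error distortion on W, rate in nats)
# at the rate R = I(W; Z^n) supplied by n samples, loosely mirroring the
# "R bits of information from the parameter W" lower-bound idea in the abstract.
import numpy as np

sigma2_w = 1.0   # assumed prior variance of the parameter W
sigma2 = 4.0     # assumed observation noise variance

def rate_from_samples(n):
    """I(W; Z^n) in nats for the Gaussian location model.

    The sample mean is a sufficient statistic, so this reduces to the
    Gaussian-channel formula 0.5 * log(1 + n * sigma2_w / sigma2)."""
    return 0.5 * np.log(1.0 + n * sigma2_w / sigma2)

def gaussian_distortion_rate(R):
    """Textbook Gaussian distortion-rate function for squared error (rate in nats)."""
    return sigma2_w * np.exp(-2.0 * R)

for n in [1, 10, 100, 1000]:
    R = rate_from_samples(n)
    D = gaussian_distortion_rate(R)
    # In this toy setting D(R) equals the posterior variance of W given Z^n,
    # sigma2_w * sigma2 / (sigma2 + n * sigma2_w), which decays as O(1/n).
    posterior_var = sigma2_w * sigma2 / (sigma2 + n * sigma2_w)
    print(f"n={n:5d}  R={R:6.3f} nats  D(R)={D:.5f}  posterior var={posterior_var:.5f}")
```

In this toy setting the distortion-rate value at R = I(W; Z^n) coincides with the Bayes posterior variance of W, so the "minimum achievable distortion given the information in the sample" decays at an O(1/n) rate. This is only meant to illustrate how plugging the information supplied by the training set into a rate-distortion function yields a bound on the achievable excess risk, which is the flavour of argument the paper develops rigorously for MER.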