On Regularization and Inference with Label Constraints
Authors: Kaifu Wang, Hangfeng He, Tin D. Nguyen, Piyush Kumar, Dan Roth
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | To the best of our knowledge, our analysis is the first to provide a theoretical view on comparing the two approaches. We believe in the importance of this comparison and hope to bring this problem to the attention of the machine learning community. In summary, our contributions include: 1. We provide an error bound (Theorem 3.6) that describes the tradeoff between the generalization gap and the optimal risk when performing regularization with constraints. 2. We propose a sufficient and necessary condition (Theorem 4.3) for constrained inference to improve a model by quantifying its reduction in risk. Based on this, we further argue that constrained inference, when used at training time, implicitly modifies the training objective in an opposite direction as in the regularization approach (Proposition 4.6). 3. We study the combination of regularization and constrained inference, and propose sufficient (Theorem 5.1) as well as necessary (Theorem 5.2) conditions for the combined algorithm to achieve improvement in both optimal risk and model complexity. |
| Researcher Affiliation | Collaboration | 1 University of Pennsylvania, Philadelphia, PA, USA; 2 University of Rochester, Rochester, NY, USA (part of the work done while at the University of Pennsylvania); 3 Massachusetts Institute of Technology, Cambridge, MA, USA; 4 Systems and Technology Research, Woburn, MA, USA. |
| Pseudocode | No | The paper focuses on theoretical analysis and mathematical proofs, and does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to code repositories for the described methodology. |
| Open Datasets | No | The paper is theoretical: it does not report experiments on any publicly available dataset, nor does it provide access information for any dataset used for illustration. |
| Dataset Splits | No | The paper is theoretical and does not provide specific details about training, test, or validation dataset splits, as it does not report on empirical experiments. |
| Hardware Specification | No | The paper is theoretical and does not describe any specific hardware used to run experiments. |
| Software Dependencies | No | The paper is theoretical and does not specify any software dependencies with version numbers used for implementation or experiments. |
| Experiment Setup | No | The paper is theoretical and does not provide details about an experimental setup, including hyperparameters or system-level training settings. |
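
The paper itself contains no code, but the two approaches named in the contributions quoted under Research Type (regularization with label constraints versus constrained inference) can be contrasted with a small sketch. The snippet below is purely illustrative and is not taken from the paper: the soft penalty `constraint_violation`, its weight `rho`, and the feasible label set `FEASIBLE` are assumptions made for this toy example, whereas the paper's results (Theorems 3.6, 4.3, 5.1 and 5.2) are stated in terms of risks and generalization bounds rather than any particular implementation.

```python
import numpy as np

# Toy multi-class example: scores over 4 labels and a label constraint that
# restricts predictions to FEASIBLE. All names and values are illustrative.
FEASIBLE = {0, 2}  # constraint: the label is known to lie in this set
rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(probs, y):
    return -np.log(probs[y] + 1e-12)

def constraint_violation(probs):
    # Probability mass assigned to labels outside the feasible set; one common
    # way to regularize with a constraint is to penalize this mass.
    return sum(p for i, p in enumerate(probs) if i not in FEASIBLE)

def regularized_loss(scores, y, rho=1.0):
    # Approach 1 (regularization): standard loss plus a constraint penalty
    # weighted by rho, pushing the model toward constraint-satisfying outputs.
    probs = softmax(scores)
    return cross_entropy(probs, y) + rho * constraint_violation(probs)

def constrained_inference(scores):
    # Approach 2 (constrained inference): leave training unchanged and, at
    # prediction time, take the argmax only over feasible labels.
    feasible = sorted(FEASIBLE)
    return feasible[int(np.argmax(scores[feasible]))]

scores = rng.normal(size=4)  # stand-in for a model's logits on one example
y = 2
print("regularized loss:", regularized_loss(scores, y, rho=0.5))
print("unconstrained prediction:", int(np.argmax(scores)))
print("constrained prediction:", constrained_inference(scores))
```

Running the sketch shows the two mechanisms acting at different stages, which is the distinction the quoted contributions analyze: the penalty modifies the training objective, while constrained inference leaves training untouched and restricts the output space at prediction time.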