On Regularization and Inference with Label Constraints

Authors: Kaifu Wang, Hangfeng He, Tin D. Nguyen, Piyush Kumar, Dan Roth

ICML 2023

Reproducibility Variable | Result | LLM Response

Research Type | Theoretical | To the best of our knowledge, our analysis is the first to provide a theoretical view on comparing the two approaches. We believe in the importance of this comparison and hope to bring this problem to the attention of the machine learning community. In summary, our contributions include:
1. We provide an error bound (Theorem 3.6) that describes the tradeoff between the generalization gap and the optimal risk when performing regularization with constraints.
2. We propose a sufficient and necessary condition (Theorem 4.3) for constrained inference to improve a model by quantifying its reduction in risk. Based on this, we further argue that constrained inference, when used at training time, implicitly modifies the training objective in the direction opposite to the regularization approach (Proposition 4.6).
3. We study the combination of regularization and constrained inference, and propose sufficient (Theorem 5.1) as well as necessary (Theorem 5.2) conditions for the combined algorithm to achieve improvement in both optimal risk and model complexity.

Researcher Affiliation | Collaboration | 1. University of Pennsylvania, Philadelphia, PA, USA; 2. University of Rochester, Rochester, NY, USA (part of the work done while at the University of Pennsylvania); 3. Massachusetts Institute of Technology, Cambridge, MA, USA; 4. Systems and Technology Research, Woburn, MA, USA.
Pseudocode | No | The paper focuses on theoretical analysis and mathematical proofs, and does not include any pseudocode or algorithm blocks.

Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to code repositories for the described methodology.

Open Datasets | No | The paper is theoretical and does not describe experiments run on a specific publicly available dataset, nor does it provide access information for any dataset it might have hypothetically used for illustrations.

Dataset Splits | No | The paper is theoretical and does not provide specific details about training, test, or validation dataset splits, as it does not report on empirical experiments.

Hardware Specification | No | The paper is theoretical and does not describe any specific hardware used to run experiments.

Software Dependencies | No | The paper is theoretical and does not specify any software dependencies with version numbers used for implementation or experiments.

Experiment Setup | No | The paper is theoretical and does not provide details about an experimental setup, including hyperparameters or system-level training settings.
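The two approaches the paper contrasts can be illustrated with a toy sketch. This is a hypothetical example (the feasible-label set, penalty weight, and scores are invented here, not taken from the paper's formal setup): regularization folds a constraint-violation penalty into the training loss, while constrained inference leaves the model unchanged and restricts prediction to constraint-satisfying labels.

```python
import numpy as np

# Hypothetical label constraint: only labels in FEASIBLE are valid outputs.
FEASIBLE = {0, 2}

def violation(probs):
    """Probability mass the model assigns to constraint-violating labels."""
    return sum(p for i, p in enumerate(probs) if i not in FEASIBLE)

def regularized_loss(probs, gold, lam=1.0):
    """Regularization approach (one common way to encode a label constraint
    as a soft training-time penalty): cross-entropy plus a term that pushes
    probability mass away from infeasible labels."""
    ce = -np.log(probs[gold])
    return ce + lam * violation(probs)

def constrained_inference(probs):
    """Constrained-inference approach: keep the trained model fixed and, at
    prediction time, take the argmax over feasible labels only."""
    return max(FEASIBLE, key=lambda i: probs[i])

probs = np.array([0.30, 0.40, 0.25, 0.05])  # unconstrained argmax is label 1
print(constrained_inference(probs))          # constrained prediction: 0
print(regularized_loss(probs, gold=0))       # -log(0.30) + 0.45, about 1.654
```

The sketch also hints at the paper's point about training with constrained inference: the penalty term changes the training objective, whereas the inference-time restriction changes only which label is read off the same scores.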