Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Robust Fine-Tuning of Deep Neural Networks with Hessian-based Generalization Guarantees

Authors: Haotian Ju, Dongyue Li, Hongyang R Zhang

ICML 2022 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform a detailed empirical study of our algorithm on various noisy environments and architectures. For example, on six image classification tasks whose training labels are generated with programmatic labeling, we show a 3.26% accuracy improvement over prior methods.
Researcher Affiliation | Academia | 1Northeastern University, Boston MA, United States.
Pseudocode | Yes | Algorithm 1 Consistent loss reweighting with layerwise projection
Open Source Code | No | The paper mentions 'For the baselines, we report the results from running their open-sourced implementations.' but does not state that the authors' own code for the described methodology is publicly available or provide a link.
Open Datasets | Yes | For image classification, we use six domains of object classification tasks from the DomainNet [PBX+19] dataset. [...] For text classification, we use the MRPC dataset from the GLUE benchmark [WSM+18]
Dataset Splits | Yes | Hyper-parameters in the fine-tuning algorithms are selected based on the accuracy of the validation dataset. [...] Table 6: Basic statistics for six datasets with noisy labels [MCS+21]. [...] Number of validation Samples
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions software like 'Adam optimizer', 'Optuna [ASY+19] package', and 'PyHessian [YGK+20]' but does not specify version numbers for these software components.
Experiment Setup | Yes | We use Adam optimizer with learning rate 1e-4 and decay the learning rate by 10 every 10 epochs. In the experiments on text classification datasets, we fine-tune the BERT-Base model for 5 epochs. We use Adam optimizer with an initial learning rate of 5e-4 and then linearly decay the learning rate. [...] We search the distance constraint parameter D in [0.05, 10] and the scaling parameter γ in [1, 5].
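
The two learning-rate schedules quoted under Experiment Setup can be sketched in plain Python. This is an illustrative reading of the quoted text, not the authors' code: it assumes "decay the learning rate by 10" means dividing by 10, and that the linear decay runs to zero over the full number of training steps. The function names `step_decay_lr` and `linear_decay_lr` and the parameter `total_steps` are hypothetical.

```python
def step_decay_lr(epoch, base_lr=1e-4, factor=10.0, every=10):
    """Learning rate at a 0-indexed epoch under step decay.

    Assumption: "decay by 10 every 10 epochs" is read as dividing the
    learning rate by `factor` once per `every` completed epochs.
    """
    return base_lr / (factor ** (epoch // every))


def linear_decay_lr(step, total_steps, base_lr=5e-4):
    """Learning rate at a 0-indexed step under linear decay.

    Assumption: the rate falls linearly from `base_lr` to zero at
    `total_steps`; the source does not state the end value.
    """
    remaining = max(0.0, 1.0 - step / total_steps)
    return base_lr * remaining


if __name__ == "__main__":
    # Image-classification setting: 1e-4, divided by 10 every 10 epochs.
    for epoch in (0, 9, 10, 20):
        print(epoch, step_decay_lr(epoch))
    # Text-classification setting: 5e-4, linearly decayed.
    for step in (0, 50, 100):
        print(step, linear_decay_lr(step, total_steps=100))
```

In a training loop, values like these would typically be handed to the optimizer once per epoch or step. The quoted search ranges for D and γ are left out because the source does not say how they were sampled.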