Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Affine-Invariant Global Non-Asymptotic Convergence Analysis of BFGS under Self-Concordance

Authors: Qiujiang Jin, Aryan Mokhtari

NeurIPS 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Next, we present numerical experiments applying BFGS to two functions satisfying Assumptions 2.2. We report our results using two different choices of initial Hessian approximation $B_0$: (i) $B_0 = I$, and (ii) $B_0 = cI$, where $c = s^\top y / \|s\|^2$, with $s = x_2 - x_1$, $y = \nabla f(x_2) - \nabla f(x_1)$, where $x_1, x_2$ are randomly selected. The line search parameters are also set as $\alpha = 0.1$ and $\beta = 0.9$. In our experiments, we also report the convergence paths of gradient descent (GD) and accelerated gradient descent (AGD), with step sizes determined using backtracking line search. The first function that we study is the cubic function from [47], $\sum_{i=1} g(v_i^\top x - v_{i+1}^\top x) - \omega_2 v_1^\top x$, where $g(x) = \frac{1}{3}|x|^3$ for $|x| \le \Delta$ and $g(x) = \Delta x^2 - \Delta^2 |x| + \frac{1}{3}\Delta^3$ for $|x| > \Delta$.
Researcher Affiliation Collaboration Qiujiang Jin, UT Austin; Aryan Mokhtari, UT Austin & Google Research
Pseudocode No The paper describes the BFGS update rule and other mathematical formulations but does not contain a clearly labeled pseudocode or algorithm block with structured steps.
Open Source Code Yes We have uploaded our Matlab codes which generate all the empirical results in the numerical experiments.
Open Datasets No The first function that we study is the cubic function from [47], $\sum_{i=1} g(v_i^\top x - v_{i+1}^\top x) - \omega_2 v_1^\top x$, where $g(x) = \frac{1}{3}|x|^3$ for $|x| \le \Delta$ and $g(x) = \Delta x^2 - \Delta^2 |x| + \frac{1}{3}\Delta^3$ for $|x| > \Delta$. The second loss is the logistic regression: $f(x) = \frac{1}{N} \sum_{i=1}^{N} \ln(1 + e^{-y_i z_i^\top x})$, where $\{z_i\}_{i=1}^{N}$ are the data points and $\{y_i\}_{i=1}^{N}$ are their corresponding labels. We assume that $z_i \in \mathbb{R}^d$ generated with standard normal distribution and $y_i \in \{-1, 1\}$ generated with uniform distribution for all $1 \le i \le N$. We choose the number of data points as $N = d$.
Dataset Splits No The paper uses synthetically generated data for its experiments (hard cubic function and logistic regression with generated zi and yi). As such, there is no mention of explicit training, validation, or test dataset splits typically associated with pre-existing datasets.
Hardware Specification Yes We only need to install the Matlab software on our personal computer with normal CPU to run our codes and reproduce the experiments, as we do not run any form of large-scale training.
Software Dependencies No The paper mentions 'Matlab software' but does not specify a version number or any other software dependencies with version numbers.
Experiment Setup Yes We report our results using two different choices of initial Hessian approximation $B_0$: (i) $B_0 = I$, and (ii) $B_0 = cI$, where $c = s^\top y / \|s\|^2$, with $s = x_2 - x_1$, $y = \nabla f(x_2) - \nabla f(x_1)$, where $x_1, x_2$ are randomly selected. The line search parameters are also set as $\alpha = 0.1$ and $\beta = 0.9$. In our experiments, we also report the convergence paths of gradient descent (GD) and accelerated gradient descent (AGD), with step sizes determined using backtracking line search.
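
The synthetic logistic regression setup and the scaled initial Hessian approximation quoted in the rows above can be illustrated with a short sketch. This is not the authors' Matlab code: it is a standard-library Python illustration, and the helper names (`make_data`, `loss`, `grad`, `initial_scale`), the dimension d = 10, the fixed seed, and the standard normal draw for the two random points are all assumptions made here, since the quoted text only says the points are "randomly selected".

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def make_data(d, rng):
    """Draw N = d points z_i ~ N(0, I_d) and labels y_i uniform on {-1, +1}."""
    Z = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(d)]
    labels = [rng.choice((-1.0, 1.0)) for _ in range(d)]
    return Z, labels


def loss(x, Z, labels):
    """Logistic loss f(x) = (1/N) * sum_i ln(1 + exp(-y_i * z_i^T x))."""
    total = 0.0
    for z_i, y_i in zip(Z, labels):
        m = -y_i * dot(z_i, x)
        # Numerically stable ln(1 + e^m) = max(m, 0) + ln(1 + e^{-|m|}).
        total += max(m, 0.0) + math.log1p(math.exp(-abs(m)))
    return total / len(Z)


def grad(x, Z, labels):
    """Gradient of the logistic loss: (1/N) * sum_i -y_i * sigmoid(m_i) * z_i."""
    g = [0.0] * len(x)
    for z_i, y_i in zip(Z, labels):
        m = -y_i * dot(z_i, x)
        e = math.exp(-abs(m))
        # Stable sigmoid(m), branching on the sign of m to avoid overflow.
        sig = 1.0 / (1.0 + e) if m >= 0 else e / (1.0 + e)
        coeff = -y_i * sig / len(Z)
        for j, z_ij in enumerate(z_i):
            g[j] += coeff * z_ij
    return g


def initial_scale(x1, x2, Z, labels):
    """Scaling c = s^T y / ||s||^2 with s = x2 - x1, y = grad f(x2) - grad f(x1)."""
    s = [b - a for a, b in zip(x1, x2)]
    g1, g2 = grad(x1, Z, labels), grad(x2, Z, labels)
    y = [b - a for a, b in zip(g1, g2)]
    return dot(s, y) / dot(s, s)


rng = random.Random(0)  # fixed seed is our choice, for repeatability only
d = 10                  # hypothetical dimension; the quoted text fixes only N = d
Z, labels = make_data(d, rng)
x1 = [rng.gauss(0.0, 1.0) for _ in range(d)]  # assumed sampling of x1, x2
x2 = [rng.gauss(0.0, 1.0) for _ in range(d)]
c = initial_scale(x1, x2, Z, labels)
print(f"c = {c:.4f}, so B0 = c * I")
```

Because the logistic loss is convex, the inner product of s and y is nonnegative, so c is nonnegative; it is strictly positive whenever the data matrix has full rank, which makes c times the identity a valid positive definite starting matrix in that case.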