Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Affine-Invariant Global Non-Asymptotic Convergence Analysis of BFGS under Self-Concordance
Authors: Qiujiang Jin, Aryan Mokhtari
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Next, we present numerical experiments applying BFGS to two functions satisfying Assumptions 2.2. We report our results using two different choices of initial Hessian approximation $B_0$: (i) $B_0 = I$, and (ii) $B_0 = cI$, where $c = s^\top y / \lVert s \rVert^2$, with $s = x_2 - x_1$, $y = \nabla f(x_2) - \nabla f(x_1)$, where $x_1, x_2$ are randomly selected. The line search parameters are also set as $\alpha = 0.1$ and $\beta = 0.9$. In our experiments, we also report the convergence paths of gradient descent (GD) and accelerated gradient descent (AGD), with step sizes determined using backtracking line search. The first function that we study is the cubic function from [47], which sums terms $g(v_i^\top x - v_{i+1}^\top x)$ over $i$ and subtracts a multiple of $v_1^\top x$, where $g(x)$ is piecewise: $\frac{1}{3}\lvert x \rvert^3$ for small $\lvert x \rvert$, and a quadratic in $\lvert x \rvert$ once $\lvert x \rvert$ exceeds a threshold. |
| Researcher Affiliation | Collaboration | Qiujiang Jin (UT Austin); Aryan Mokhtari (UT Austin & Google Research) |
| Pseudocode | No | The paper describes the BFGS update rule and other mathematical formulations but does not contain a clearly labeled pseudocode or algorithm block with structured steps. |
| Open Source Code | Yes | We have uploaded our Matlab codes which generate all the empirical results in the numerical experiments. |
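| Open Datasets | No | The first function that we study is the cubic function from [47], which sums terms $g(v_i^\top x - v_{i+1}^\top x)$ over $i$ and subtracts a multiple of $v_1^\top x$, where $g(x)$ is piecewise: $\frac{1}{3}\lvert x \rvert^3$ for small $\lvert x \rvert$, and a quadratic in $\lvert x \rvert$ once $\lvert x \rvert$ exceeds a threshold. The second loss is the logistic regression: $f(x) = \frac{1}{N} \sum_{i=1}^{N} \ln\left(1 + e^{-y_i z_i^\top x}\right)$, where $\{z_i\}_{i=1}^{N}$ are the data points and $\{y_i\}_{i=1}^{N}$ are their corresponding labels. We assume that $z_i \in \mathbb{R}^d$ generated with standard normal distribution and $y_i \in \{-1, 1\}$ generated with uniform distribution for all $1 \le i \le N$. We choose the number of data points as $N = d$. |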
| Dataset Splits | No | The paper uses synthetically generated data for its experiments (hard cubic function and logistic regression with generated zi and yi). As such, there is no mention of explicit training, validation, or test dataset splits typically associated with pre-existing datasets. |
| Hardware Specification | Yes | We only need to install the Matlab software on our personal computer with normal CPU to run our codes and reproduce the experiments, as we do not run any form of large-scale training. |
| Software Dependencies | No | The paper mentions 'Matlab software' but does not specify a version number or any other software dependencies with version numbers. |
| Experiment Setup | Yes | We report our results using two different choices of initial Hessian approximation $B_0$: (i) $B_0 = I$, and (ii) $B_0 = cI$, where $c = s^\top y / \lVert s \rVert^2$, with $s = x_2 - x_1$, $y = \nabla f(x_2) - \nabla f(x_1)$, where $x_1, x_2$ are randomly selected. The line search parameters are also set as $\alpha = 0.1$ and $\beta = 0.9$. In our experiments, we also report the convergence paths of gradient descent (GD) and accelerated gradient descent (AGD), with step sizes determined using backtracking line search. |
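
The setup quoted in the table (two choices of $B_0$, line search parameters 0.1 and 0.9, synthetic logistic regression with $N = d$) can be illustrated with a minimal Python sketch. This is not the authors' Matlab code. Several details are assumptions made only for illustration: the two line search parameters are read as the sufficient-decrease and curvature constants of the weak Wolfe conditions, the step size is found by a simple bisection/doubling scheme, and the dimension (`dim = 10`), random seed, zero starting point, tolerance, and iteration cap are arbitrary choices not taken from the paper.

```python
import numpy as np


def logistic_loss(x, Z, labels):
    """f(x) = (1/N) * sum_i ln(1 + exp(-y_i * z_i^T x))."""
    margins = labels * (Z @ x)
    return np.mean(np.logaddexp(0.0, -margins))  # stable ln(1 + e^{-m})


def logistic_grad(x, Z, labels):
    margins = labels * (Z @ x)
    # 1 / (1 + e^{m}) written with tanh so large margins cannot overflow
    sigma_neg = 0.5 * (1.0 - np.tanh(0.5 * margins))
    return -(Z.T @ (labels * sigma_neg)) / len(labels)


def weak_wolfe_step(f, grad, x, p, alpha=0.1, beta=0.9, max_iter=50):
    """Bisection/doubling search for a step satisfying the weak Wolfe
    conditions (assumed reading of the two line search parameters)."""
    fx, slope = f(x), grad(x) @ p
    lo, hi, eta = 0.0, np.inf, 1.0
    for _ in range(max_iter):
        if f(x + eta * p) > fx + alpha * eta * slope:
            hi = eta  # sufficient decrease fails: step too long
        elif grad(x + eta * p) @ p < beta * slope:
            lo = eta  # curvature condition fails: step too short
        else:
            return eta
        eta = 0.5 * (lo + hi) if np.isfinite(hi) else 2.0 * lo
    return eta


def bfgs(f, grad, x0, B0, tol=1e-6, max_iter=200):
    """BFGS iteration that maintains H, the inverse of the approximation B."""
    x, H, g = x0.copy(), np.linalg.inv(B0), grad(x0)
    identity = np.eye(len(x0))
    losses = [f(x)]
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol:
            break
        p = -H @ g
        s = weak_wolfe_step(f, grad, x, p) * p
        g_new = grad(x + s)
        y = g_new - g
        sy = s @ y
        if sy > 1e-16:  # skip the update if curvature is not positive
            V = identity - np.outer(s, y) / sy
            H = V @ H @ V.T + np.outer(s, s) / sy
        x, g = x + s, g_new
        losses.append(f(x))
    return x, losses


# Synthetic data as described: standard normal z_i, uniform labels, N = d.
rng = np.random.default_rng(0)  # seed is an arbitrary choice
dim = 10                        # dimension is an arbitrary choice
Z = rng.standard_normal((dim, dim))
labels = rng.choice([-1.0, 1.0], size=dim)


def f(x):
    return logistic_loss(x, Z, labels)


def grad(x):
    return logistic_grad(x, Z, labels)


# Scaling c = s^T y / ||s||^2 from two random points x1, x2.
x1, x2 = rng.standard_normal(dim), rng.standard_normal(dim)
s0, y0 = x2 - x1, grad(x2) - grad(x1)
c = (s0 @ y0) / (s0 @ s0)

x_start = np.zeros(dim)  # starting point is an assumption
results = {
    "B0 = I": bfgs(f, grad, x_start, np.eye(dim)),
    "B0 = cI": bfgs(f, grad, x_start, c * np.eye(dim)),
}
for name, (_, losses) in results.items():
    print(f"{name}: {len(losses) - 1} iterations, final loss {losses[-1]:.3e}")
```

Working with the inverse approximation `H` avoids solving a linear system at every iteration. With $N = d$ Gaussian data points the classes are almost surely linearly separable, so the loss tends toward zero without attaining a minimizer, and the sketch stops on a gradient-norm tolerance or the iteration cap.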