Continuous-time Lower Bounds for Gradient-based Algorithms

Authors: Michael Muehlebach, Michael Jordan

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | This section illustrates the results from Sec. 3 on a simulation example. We choose C_{µ,1} to be the set of all scalar functions that have a single non-degenerate minimum at the origin and whose Hessian is constant on the intervals I1 = (−∞, −4.5), I2 = (−4.5, −3.5), ..., I9 = (4.5, ∞). The Hessian may change on the interval boundaries and takes the values summarized in Table 1. Thus, the set C_{µ,1} includes certain nonconvex functions and satisfies the Assumptions (C1)-(C3). In addition, it is straightforward to generate random functions f ∈ C_{µ,1} by uniformly sampling potential values of the Hessian until the resulting function has a single local minimum. We evaluate and compare the performance of two algorithms. (A sketch of this sampling procedure appears after the table.)
Researcher Affiliation | Academia | Michael Muehlebach¹, Michael I. Jordan¹. ¹Division of Electrical Engineering and Computer Science, and Department of Statistics, University of California, Berkeley, Berkeley, USA.
Pseudocode | No | The paper provides mathematical equations describing algorithms (e.g., (8), (20), (26), (27)) but does not include any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement about releasing code or links to a code repository for the described methodology.
Open Datasets | No | The paper states: 'It is straightforward to generate random functions f ∈ C_{µ,1}, by uniformly sampling potential values of the Hessian until the resulting function has a single local minimum.' However, it does not provide concrete access information (link, DOI, formal citation) for the generated data, which is custom to this paper.
Dataset Splits | No | The paper describes simulation experiments: 'Both algorithms are evaluated on 50 randomly generated functions f ∈ C_{µ,1}, and for each f, 50 simulations with randomized initial conditions are performed.' It does not refer to standard training, validation, or test dataset splits in the conventional machine learning sense, as it generates functions dynamically rather than using a fixed, partitioned dataset.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions that 'The trajectories are simulated with the standard fourth-order Runge-Kutta method with a time step of 0.01' but does not provide specific version numbers for any software dependencies, libraries, or programming languages used. (An RK4 stepper matching the quoted description is sketched after the table.)
Experiment Setup | Yes | The trajectories are simulated with the standard fourth-order Runge-Kutta method with a time step of 0.01. Both algorithms are evaluated on 50 randomly generated functions f ∈ C_{µ,1}, and for each f, 50 simulations with randomized initial conditions are performed. The initial conditions are sampled from independent normal distributions with zero mean and standard deviation 4.5 (motivated by the boundaries of the intervals I1 and I9). Each simulation terminates once a tolerance of |z_sim(T_max)| <= 10^-8 is reached. (The evaluation loop is sketched after the table.)
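
The rejection-sampling procedure quoted in the Research Type and Open Datasets rows can be made concrete with a short sketch. The Python snippet below is a minimal illustration, not the authors' code: the interior interval boundaries (only I1, I2, and I9 are stated explicitly above) and the candidate Hessian values (the paper's Table 1 is not reproduced in this report) are assumptions, and the acceptance test takes the conservative reading that the origin must be the only stationary point.

```python
import numpy as np

# Breakpoints of the intervals I1..I9. Only I1 = (-inf, -4.5),
# I2 = (-4.5, -3.5), and I9 = (4.5, inf) are stated in the report;
# the middle breakpoints are an assumption that yields nine intervals.
B = np.array([-4.5, -3.5, -2.5, -1.5, 1.5, 2.5, 3.5, 4.5])

# Candidate Hessian values: placeholders for the paper's Table 1.
H_CANDIDATES = np.array([-1.0, 1.0, 5.0])

def grad_at_knots(h):
    """f' at the breakpoints B, for Hessian values h (one per interval),
    obtained by integrating the piecewise-constant Hessian from 0 with
    f'(0) = 0, so the origin is a stationary point."""
    g = np.zeros(len(B))
    g[4] = h[4] * B[4]            # the origin lies in I5 = (B[3], B[4])
    for i in range(5, len(B)):    # integrate to the right of the origin
        g[i] = g[i - 1] + h[i] * (B[i] - B[i - 1])
    g[3] = h[4] * B[3]
    for i in range(2, -1, -1):    # integrate to the left of the origin
        g[i] = g[i + 1] - h[i + 1] * (B[i + 1] - B[i])
    return g

def grad_f(x, h):
    """f'(x) for the piecewise-quadratic f with Hessian values h
    (written for clarity, not speed)."""
    g = grad_at_knots(h)
    i = np.searchsorted(B, x)
    if i == 0:                    # x lies in I1, left of all breakpoints
        return g[0] + h[0] * (x - B[0])
    return g[i - 1] + h[i] * (x - B[i - 1])

def sample_function(rng):
    """Uniformly sample Hessian values until f has a single non-degenerate
    minimum at the origin; the test requires the origin to be the only
    stationary point, which is exact since f' is piecewise linear."""
    while True:
        h = rng.choice(H_CANDIDATES, size=9)
        g = grad_at_knots(h)
        if (h[4] > 0                       # non-degenerate minimum at 0
                and h[0] > 0 and h[8] > 0  # f' keeps its sign at infinity
                and np.all(g[4:] > 0)      # f' > 0 right of the origin
                and np.all(g[:4] < 0)):    # f' < 0 left of the origin
            return h
```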
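The integrator quoted in the Software Dependencies row is the classical fourth-order Runge-Kutta method with a step of 0.01; one step looks as follows. The heavy-ball dynamics included here is only a placeholder: the report does not name the two continuous-time algorithms being compared.

```python
import numpy as np

DT = 0.01  # time step quoted in the report

def rk4_step(vector_field, z, dt=DT):
    """One step of the standard fourth-order Runge-Kutta method."""
    k1 = vector_field(z)
    k2 = vector_field(z + 0.5 * dt * k1)
    k3 = vector_field(z + 0.5 * dt * k2)
    k4 = vector_field(z + dt * k3)
    return z + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def heavy_ball(z, grad, damping=1.0):
    """Placeholder dynamics (heavy-ball ODE, an assumption): state
    z = (x, v) with x' = v and v' = -damping * v - f'(x)."""
    x, v = z
    return np.array([v, -damping * v - grad(x)])
```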
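Combining the two sketches above gives the evaluation loop quoted in the Experiment Setup row: 50 random functions, 50 randomized initial conditions each, zero-mean normal initial conditions with standard deviation 4.5, and termination at a tolerance of 10^-8. Here t_max is an assumption (the report states only the tolerance), and |z_sim| is interpreted as the norm of the full state.

```python
TOL = 1e-8  # termination tolerance quoted in the report

def run_experiment(rng, t_max=500.0):
    """50 random functions x 50 randomized initial conditions; returns the
    termination time of every simulation. t_max is an assumed cutoff."""
    times = []
    for _ in range(50):                       # 50 randomly generated f
        h = sample_function(rng)
        field = lambda z, h=h: heavy_ball(z, lambda x: grad_f(x, h))
        for _ in range(50):                   # 50 randomized initial conditions
            z = rng.normal(0.0, 4.5, size=2)  # zero mean, std 4.5
            t = 0.0
            while np.linalg.norm(z) > TOL and t < t_max:
                z = rk4_step(field, z)
                t += DT
            times.append(t)
    return np.array(times)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print("mean termination time:", run_experiment(rng).mean())
```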