Direct Runge-Kutta Discretization Achieves Acceleration
Authors: Jingzhao Zhang, Aryan Mokhtari, Suvrit Sra, Ali Jadbabaie
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide numerical experiments that verify the theoretical rates predicted by our results. (Section 5, Numerical experiments) We perform numerical experiments to verify Theorem 1 and compare the ODE direct discretization (DD) methods described in Algorithm 1 against gradient descent (GD) and Nesterov's accelerated gradient (NAG) method. All figures in this section are on a log-log scale. For each optimization method, we empirically choose the largest step size among {10^k | k ∈ Z} such that the algorithm remains stable in the first 1000 iterations. |
| Researcher Affiliation | Academia | Jingzhao Zhang, LIDS, Massachusetts Institute of Technology, Cambridge, MA 02139, jzhzhang@mit.edu; Aryan Mokhtari, LIDS, Massachusetts Institute of Technology, Cambridge, MA 02139, aryanm@mit.edu; Suvrit Sra, LIDS and IDSS, Massachusetts Institute of Technology, Cambridge, MA 02139, suvrit@mit.edu; Ali Jadbabaie, LIDS and IDSS, Massachusetts Institute of Technology, Cambridge, MA 02139, jadbabai@mit.edu |
| Pseudocode | Yes | Algorithm 1: Input(f, x0, p, L, M, s, N) |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that open-source code for the methodology is provided. |
| Open Datasets | No | The paper mentions generating a "synthetic linearly separable dataset" and using "L4 loss" and "logistic loss" on "the same set of data points" but does not provide any concrete access information (link, DOI, specific repository, or formal citation with authors/year) to these datasets or details about their public availability. |
| Dataset Splits | No | The paper does not explicitly provide details about training/validation/test dataset splits, percentages, or sample counts. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory, or specific computing environments) used for running the experiments. |
| Software Dependencies | No | The paper does not specify version numbers for any software components, libraries, or solvers used in the experiments. |
| Experiment Setup | Yes | For each optimization method, we empirically choose the largest step size among {10^k | k ∈ Z} such that the algorithm remains stable in the first 1000 iterations. In particular, we discretize the ODE (11) for p = 2 with integrators of different orders, i.e., s ∈ {1, 2, 4}, and compare them against GD and NAG. In Figure 1b, we empirically explore the convergence rate of discretizing the ODE x''(t) + ((2q+1)/t) x'(t) + q^2 t^(q-2) ∇f(x(t)) = 0 (22) when q = p. We minimize the same L2 loss with different values of q using a fourth-order integrator with the same step size. (A minimal sketch of this direct discretization and of the step-size selection appears below the table.) |
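The experiment setup quoted above can be illustrated with a small sketch. The following Python snippet is not the paper's exact Algorithm 1 (which also takes the smoothness constants L and M and the iteration budget N as inputs); it only shows a classical fourth-order Runge-Kutta method (s = 4) applied directly to the second-order ODE reconstructed above with q = p = 2, rewritten as a first-order system in (x, v). The least-squares objective, the starting time t0 = 1, and all parameter defaults are illustrative assumptions.

```python
import numpy as np

def ode_rhs(t, state, grad, p=2):
    """Right-hand side of the first-order system equivalent to
    x''(t) + ((2p+1)/t) x'(t) + p^2 t^(p-2) grad_f(x(t)) = 0."""
    x, v = state
    dx = v
    dv = -(2 * p + 1) / t * v - p ** 2 * t ** (p - 2) * grad(x)
    return dx, dv

def rk4_direct_discretization(grad, x0, h=1e-2, n_iters=1000, p=2, t0=1.0):
    """Classical 4th-order Runge-Kutta (s = 4) applied directly to the ODE,
    starting at t0 > 0 to avoid the singular coefficient at t = 0."""
    x, v, t = x0.astype(float), np.zeros_like(x0, dtype=float), t0
    for _ in range(n_iters):
        k1 = ode_rhs(t,         (x,                 v),                 grad, p)
        k2 = ode_rhs(t + h / 2, (x + h / 2 * k1[0], v + h / 2 * k1[1]), grad, p)
        k3 = ode_rhs(t + h / 2, (x + h / 2 * k2[0], v + h / 2 * k2[1]), grad, p)
        k4 = ode_rhs(t + h,     (x + h * k3[0],     v + h * k3[1]),     grad, p)
        x = x + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        v = v + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += h
    return x

# Illustrative use on a synthetic least-squares objective f(x) = 0.5 * ||A x - b||^2.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((50, 20)), rng.standard_normal(50)
x_out = rk4_direct_discretization(lambda x: A.T @ (A @ x - b), np.zeros(20))
```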
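The step-size selection rule quoted in the table (the largest step size among {10^k | k ∈ Z} such that the algorithm remains stable in the first 1000 iterations) can be mimicked with a simple grid search. In the hedged sketch below, `run_method` is a hypothetical wrapper standing in for any one optimizer (GD, NAG, or the RK discretization above), and "stable" is interpreted as the iterates remaining finite, which is an assumption rather than the paper's stated criterion.

```python
import numpy as np

def largest_stable_step_size(run_method, k_range=range(-6, 4), n_check=1000):
    """Grid-search the step sizes {10^k : k in k_range}, largest first, and
    return the first one whose first `n_check` iterates all stay finite.
    `run_method(step_size, n_iters)` returns the optimizer's iterate history."""
    for k in sorted(k_range, reverse=True):
        h = 10.0 ** k
        iterates = np.asarray(run_method(h, n_check))
        if np.all(np.isfinite(iterates)):  # crude stand-in for "remains stable"
            return h
    raise ValueError("no stable step size found on the grid")

# Illustrative use with plain gradient descent on a synthetic least-squares problem.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((50, 20)), rng.standard_normal(50)

def run_gd(step_size, n_iters):
    x, history = np.zeros(A.shape[1]), []
    for _ in range(n_iters):
        x = x - step_size * (A.T @ (A @ x - b))  # may overflow for large steps
        history.append(x.copy())
    return np.array(history)

best_h = largest_stable_step_size(run_gd)
```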