Hamiltonian descent for composite objectives
Authors: Brendan O'Donoghue, Chris J. Maddison
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conclude with some numerical examples that show the promise of our approach. We give an example where our technique can solve a convex quadratic minimization problem orders of magnitude faster than several commonly-used gradient methods, including conjugate gradient, when the conditioning of the problem is poor. ... In this section we present two numerical examples where we compare the explicit discretization of Hamiltonian descent flow to gradient descent. (A hedged sketch of such an explicit discretization appears after this table.) |
| Researcher Affiliation | Industry | Brendan O'Donoghue, DeepMind; Chris J. Maddison, DeepMind / University of Oxford |
| Pseudocode | No | The paper describes algorithms (discretizations) in mathematical notation but does not provide any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any links to open-source code for the described methodology or an explicit statement of code release. |
| Open Datasets | No | The paper uses generated data for numerical examples ("randomly generate a nonsingular matrix M", "data were generated randomly"), but it does not state whether these generated datasets are publicly available, nor does it provide access information for them. |
| Dataset Splits | No | The paper describes experiments but does not explicitly detail training, validation, and test splits for the data used in its numerical examples. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using a "convex cone solver SCS" but does not specify its version number or any other software dependencies with version details. |
| Experiment Setup | Yes | We chose m = n = 1000 and for simplicity we chose B = I, λ = 1, and randomly generated each entry in A to be IID N(0, 1). The best step size was chosen via exhaustive search for all three algorithms. The matrix M was randomly generated but chosen in such a way so as to be close to the identity. ... We chose dimension m = 500 and n = 1000 data points and we set λ1 = λ2 = 0.01. The data were generated randomly, and then perturbed so as to give a high condition number, which was 1.0 × 10^8. The best step size for both algorithms was found using exhaustive search. (A hedged data-generation and step-size-search sketch appears after this table.) |
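
The explicit discretization the paper compares against gradient descent is not reproduced in this report. As a minimal illustrative sketch only, the block below discretizes a damped Hamiltonian flow with the quadratic kinetic energy k(p) = ‖p‖²/2 using a symplectic Euler step, and runs it alongside plain gradient descent on a poorly conditioned toy quadratic. The kinetic energy, step sizes, damping `gamma`, random seed, and toy problem are all assumptions of this sketch, not the paper's exact scheme or benchmark.

```python
import numpy as np

def gradient_descent(grad_f, x0, step, iters):
    """Baseline: x_{k+1} = x_k - step * grad_f(x_k)."""
    x = x0.copy()
    for _ in range(iters):
        x = x - step * grad_f(x)
    return x

def hamiltonian_descent(grad_f, x0, step, gamma, iters):
    """Damped Hamiltonian flow  x' = p,  p' = -gamma * p - grad_f(x),
    assuming kinetic energy k(p) = ||p||^2 / 2, discretized with a
    symplectic Euler step (update p first, then x with the new p)."""
    x, p = x0.copy(), np.zeros_like(x0)
    for _ in range(iters):
        p = p - step * (gamma * p + grad_f(x))
        x = x + step * p
    return x

# Toy ill-conditioned quadratic f(x) = 0.5 * x^T Q x (condition number 1e4).
Q = np.diag(np.logspace(0, 4, 50))
grad_f = lambda x: Q @ x
x0 = np.random.default_rng(0).standard_normal(50)

x_gd = gradient_descent(grad_f, x0, step=1e-4, iters=5000)
x_hd = hamiltonian_descent(grad_f, x0, step=1e-2, gamma=1.0, iters=5000)
print("gradient descent    f(x) =", 0.5 * x_gd @ Q @ x_gd)
print("hamiltonian descent f(x) =", 0.5 * x_hd @ Q @ x_hd)
```

On this toy problem the momentum-like Hamiltonian iteration tolerates a much larger stable step size than gradient descent, which is the qualitative behavior the quoted passage claims for poorly conditioned problems.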
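
The quoted experiment setup fixes the data generation but not the full composite objective. The sketch below regenerates the first problem's data as described (m = n = 1000, B = I, λ = 1, IID N(0, 1) entries of A) and mimics the exhaustive step-size search the authors mention. The right-hand side `b`, the ridge-regularized least-squares surrogate objective, the random seed, and the search grid are hypothetical stand-ins; the paper's actual objective and search range are not quoted here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Data generation as quoted: m = n = 1000, B = I, lambda = 1,
# each entry of A drawn IID from N(0, 1).
m, n, lam = 1000, 1000, 1.0
A = rng.standard_normal((m, n))
B = np.eye(n)
b = rng.standard_normal(m)  # hypothetical: the paper does not specify b

# Assumed surrogate objective f(x) = 0.5*||Ax - b||^2 + 0.5*lam*x^T B x.
def f(x):
    r = A @ x - b
    return 0.5 * r @ r + 0.5 * lam * x @ (B @ x)

def grad_f(x):
    return A.T @ (A @ x - b) + lam * (B @ x)

# Crude "exhaustive search" over a log-spaced grid of candidate step
# sizes, mimicking how the best step size was chosen per algorithm.
def best_step(grid, iters=200):
    def final_objective(step):
        x = np.zeros(n)
        for _ in range(iters):
            x = x - step * grad_f(x)
        return f(x)
    return min(grid, key=final_objective)

steps = np.logspace(-6, -3, 7)  # assumed search range
print("best gradient-descent step:", best_step(steps))
```

The same grid search would be applied to each competing method in turn, so that every algorithm is compared at its individually tuned step size, as the quoted setup describes.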