The Curse of Unrolling: Rate of Differentiating Through Optimization
Authors: Damien Scieur, Gauthier Gidel, Quentin Bertrand, Fabian Pedregosa
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5.1 Experiments on least squares objective. We compare multiple algorithms for estimating the Jacobian (OPT) of the solution of a ridge regression problem (Example (1)) for a fixed value of θ = 10⁻³. Figure 1 shows the objective and Jacobian suboptimality on a ridge regression problem with the breast-cancer² as underlying dataset. Figure 4 shows the Jacobian suboptimality as a function of the number of iterations, on both the breast-cancer and bodyfat³ dataset, and for a synthetic dataset (where H(θ) is generated as AᵀA, where each entry in A is generated from a standard Gaussian distribution). Appendix B contains further details and experiments on a logistic regression objective. |
| Researcher Affiliation | Collaboration | Damien Scieur (Samsung SAIL Montreal, damien.scieur@gmail.com); Quentin Bertrand (Mila & Université de Montréal, quentin.bertrand@mila.quebec); Gauthier Gidel (Mila & Université de Montréal, Canada CIFAR AI Chair, gidelgau@mila.quebec); Fabian Pedregosa (Google Research, pedregosa@google.com) |
| Pseudocode | No | The paper describes algorithms using mathematical equations and text, but it does not include a clearly labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [No] |
| Open Datasets | Yes | Figure 1 shows the objective and Jacobian suboptimality on a ridge regression problem with the breast-cancer dataset... Figures 4...on both the breast-cancer and bodyfat dataset... Footnote 2: https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic) Footnote 3: http://lib.stat.cmu.edu/datasets/ |
| Dataset Splits | No | The paper uses datasets but does not explicitly provide specific train/validation/test split percentages, sample counts, or a detailed splitting methodology. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as GPU or CPU models, or cloud computing instance types with their specifications. |
| Software Dependencies | No | The paper does not mention specific software names with version numbers (e.g., Python 3.8, PyTorch 1.9, TensorFlow 2.x) that would be needed to replicate the experiments. |
| Experiment Setup | Yes | We compare multiple algorithms for estimating the Jacobian (OPT) of the solution of a ridge regression problem (Example (1)) for a fixed value of θ = 10⁻³. The non-asymptotic algorithm is rather complicated to implement; see Appendix D. Moreover, it requires a bound on the spectrum of H(θ), namely [ℓ, L], and one also has to choose an associated expected spectral density µ(λ) (parametrized by α) and the parameter η. ... The featured two-phase curve was computed using the step-size with a fastest asymptotic rate, computed through a grid-search on the step-size values. |
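
The Jacobian estimation task quoted in the Experiment Setup row can be illustrated with a minimal NumPy sketch. It assumes the standard ridge objective f(x, θ) = ½‖Ax − b‖² + (θ/2)‖x‖², so that H(θ) = AᵀA + θI and the exact Jacobian is ∂x*/∂θ = −H(θ)⁻¹x*(θ). Plain gradient descent stands in for the algorithms the authors compare, and the synthetic Gaussian data, the step size 1/L, the iteration count, and every function name below are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def exact_ridge_jacobian(A, b, theta):
    """Closed-form ridge solution x*(theta) and its derivative w.r.t. theta."""
    H = A.T @ A + theta * np.eye(A.shape[1])  # Hessian H(theta) of the assumed objective
    x_star = np.linalg.solve(H, A.T @ b)
    # Implicit differentiation of H(theta) x* = A^T b gives dx*/dtheta = -H^{-1} x*.
    jac_star = -np.linalg.solve(H, x_star)
    return x_star, jac_star


def unrolled_ridge_jacobian(A, b, theta, step, n_iters):
    """Gradient descent on the ridge objective, differentiating each update w.r.t. theta."""
    n_features = A.shape[1]
    H = A.T @ A + theta * np.eye(n_features)
    Atb = A.T @ b
    x = np.zeros(n_features)
    jac = np.zeros(n_features)  # x_0 does not depend on theta
    for _ in range(n_iters):
        # Differentiate x_{t+1} = x_t - step * (H x_t - A^T b) with respect to theta;
        # the extra "+ x" term comes from dH/dtheta = I and uses the current iterate.
        jac = jac - step * (H @ jac + x)
        x = x - step * (H @ x - Atb)
    return x, jac


# Hypothetical synthetic problem: entries of A drawn from a standard Gaussian.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
b = rng.standard_normal(50)
theta = 1e-3

H = A.T @ A + theta * np.eye(A.shape[1])
L = np.linalg.eigvalsh(H).max()  # largest eigenvalue of H(theta)
step = 1.0 / L                   # illustrative choice, not the grid-searched step size

x_star, jac_star = exact_ridge_jacobian(A, b, theta)
x_T, jac_T = unrolled_ridge_jacobian(A, b, theta, step, n_iters=2000)

jac_error = np.linalg.norm(jac_T - jac_star)  # Jacobian suboptimality after unrolling
print("Jacobian error after unrolling:", jac_error)
```

Recording `np.linalg.norm(jac - jac_star)` inside the loop, rather than only at the end, would give a per-iteration curve of the kind the quoted text describes for Figure 4, though the curves will differ from the paper's because the data and step size here are assumptions.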