Nonsmooth Implicit Differentiation for Machine-Learning and Optimization
Authors: Jérôme Bolte, Tam Le, Edouard Pauwels, Antonio Silveti-Falls
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide several applications such as training deep equilibrium networks, training neural nets with conic optimization layers, or hyperparameter tuning for nonsmooth Lasso-type models. To show the sharpness of our assumptions, we present numerical experiments showcasing the extremely pathological gradient dynamics one can encounter when applying implicit algorithmic differentiation without any hypothesis. (Hedged code sketches of two of these patterns appear after the table.) |
| Researcher Affiliation | Academia | Jérôme Bolte (Toulouse School of Economics, Univ. Toulouse, Toulouse, France); Tam Le (Toulouse School of Economics, Univ. Toulouse, Toulouse, France); Edouard Pauwels (IRIT, CNRS, Univ. Toulouse, Toulouse, France); Antonio Silveti-Falls (Toulouse School of Economics, Univ. Toulouse, Toulouse, France) |
| Pseudocode | No | The paper describes algorithmic processes such as backpropagation and gradient descent in narrative text, but it does not include any formal pseudocode blocks or clearly labeled algorithm sections. |
| Open Source Code | No | The paper references existing software libraries used (e.g., TensorFlow, PyTorch, JAX, cvxpylayers) but does not state that the authors' own implementation code for the presented methodology is open-source or provide a link to it. |
| Open Datasets | No | The numerical experiments are based on constructed mathematical problems and models (e.g., a bilevel problem, the Lorenz ODE system, a Lasso problem formulation with parameters X and y), rather than publicly available datasets. Therefore, no access information for a dataset is provided. |
| Dataset Splits | No | The paper's numerical experiments are based on constructed mathematical models, not external datasets, so there are no mentions of training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU models, CPU types, memory amounts, or cloud instance specifications) used for running the experiments. |
| Software Dependencies | No | The paper mentions software components such as “TensorFlow [1], PyTorch [47], JAX [16]” and “cvxpylayers [2]” used in their work. However, it does not provide specific version numbers for these software dependencies, which would be necessary for reproducibility. |
| Experiment Setup | No | The paper describes the mathematical problems and general methods used for the numerical experiments (e.g., “implement gradient descent”, “perform gradient ascent”), but it does not provide specific experimental setup details such as concrete hyperparameter values (e.g., learning rates, batch sizes, number of epochs) or specific optimizer settings. |
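To make the fixed-point differentiation pattern quoted in the Research Type row concrete, below is a minimal sketch of implicit differentiation through a deep-equilibrium-style layer in JAX, following the standard `jax.custom_vjp` fixed-point recipe. This is not the authors' code: the names (`fwd_solver`, `fixed_point_layer`) and the toy `tanh` map are illustrative, and the paper's contribution is precisely justifying this recipe in the nonsmooth setting via conservative Jacobians.

```python
# Sketch (not the authors' implementation): implicit differentiation
# through a fixed-point layer z* = f(params, x, z*), the pattern behind
# deep equilibrium networks. All names here are hypothetical.
from functools import partial

import jax
import jax.numpy as jnp


def fwd_solver(f, z_init, n_iter=200):
    """Naive fixed-point iteration z <- f(z)."""
    z = z_init
    for _ in range(n_iter):
        z = f(z)
    return z


@partial(jax.custom_vjp, nondiff_argnums=(0,))
def fixed_point_layer(f, params, x):
    """Return z* satisfying z* = f(params, x, z*)."""
    return fwd_solver(lambda z: f(params, x, z), jnp.zeros_like(x))


def fixed_point_layer_fwd(f, params, x):
    z_star = fixed_point_layer(f, params, x)
    return z_star, (params, x, z_star)


def fixed_point_layer_bwd(f, res, v):
    # Implicit function theorem: the VJP solves the adjoint fixed point
    # u = v + (df/dz)^T u, then pulls u back through params and x.
    params, x, z_star = res
    _, vjp_z = jax.vjp(lambda z: f(params, x, z), z_star)
    u = fwd_solver(lambda u: v + vjp_z(u)[0], jnp.zeros_like(v))
    _, vjp_params_x = jax.vjp(lambda p, xx: f(p, xx, z_star), params, x)
    return vjp_params_x(u)


fixed_point_layer.defvjp(fixed_point_layer_fwd, fixed_point_layer_bwd)

# Toy layer: tanh is smooth, but the same recipe is what the paper
# analyzes for nonsmooth maps (e.g. ReLU) via conservative Jacobians.
def f(params, x, z):
    W, b = params
    return jnp.tanh(W @ z + x + b)


key = jax.random.PRNGKey(0)
W = 0.5 * jax.random.normal(key, (3, 3)) / 3.0  # small norm => contraction
b = jnp.zeros(3)
x = jnp.ones(3)

loss = lambda params, x: jnp.sum(fixed_point_layer(f, params, x) ** 2)
grads = jax.grad(loss)((W, b), x)
```

Note that the backward pass never unrolls the forward solver: it solves the adjoint fixed point u = v + (df/dz)^T u, which is where assumptions like those the paper studies (invertibility of the relevant conservative Jacobian) enter.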
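The Open Datasets row mentions a Lasso formulation with parameters X and y. As a contrast to the implicit scheme above, here is a hedged sketch of differentiating the regularization weight through an unrolled ISTA solver, i.e., algorithmic differentiation of the nonsmooth iterations rather than of the fixed-point equation; all data, step sizes, and names are synthetic and illustrative.

```python
# Sketch, assuming the Lasso setup quoted in the table: differentiate a
# validation-style loss w.r.t. the regularization weight `lam` through
# unrolled ISTA iterations. JAX assigns a conservative
# (almost-everywhere) derivative at the kinks of the soft threshold.
import jax
import jax.numpy as jnp


def soft_threshold(v, t):
    # Nonsmooth prox of t * ||.||_1.
    return jnp.sign(v) * jnp.maximum(jnp.abs(v) - t, 0.0)


def ista_solve(lam, X, y, step=1e-2, n_iter=500):
    # Proximal-gradient (ISTA) iterations for the Lasso objective
    # 0.5 * ||X z - y||^2 + lam * ||z||_1; step must be <= 1/||X||^2.
    z = jnp.zeros(X.shape[1])
    for _ in range(n_iter):
        z = soft_threshold(z - step * X.T @ (X @ z - y), step * lam)
    return z


X = jax.random.normal(jax.random.PRNGKey(1), (20, 5))  # synthetic data
y = jax.random.normal(jax.random.PRNGKey(2), (20,))

# Differentiating through the unrolled solver; the paper's implicit
# approach instead differentiates the fixed-point equation directly,
# avoiding the memory cost of storing all iterates.
val_loss = lambda lam: jnp.sum((X @ ista_solve(lam, X, y) - y) ** 2)
g = jax.grad(val_loss)(0.1)
```

The paper's pathological examples concern exactly this kind of computation: without its hypotheses, the almost-everywhere derivatives propagated through nonsmooth iterations can drive gradient dynamics far from anything meaningful.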