Nonsmooth Implicit Differentiation for Machine-Learning and Optimization

Authors: Jérôme Bolte, Tam Le, Edouard Pauwels, Antonio Silveti-Falls

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We provide several applications such as training deep equilibrium networks, training neural nets with conic optimization layers, or hyperparameter-tuning for nonsmooth Lasso-type models. To show the sharpness of our assumptions, we present numerical experiments showcasing the extremely pathological gradient dynamics one can encounter when applying implicit algorithmic differentiation without any hypothesis."
Researcher Affiliation | Academia | Jérôme Bolte (Toulouse School of Economics, Univ. Toulouse, Toulouse, France); Tam Le (Toulouse School of Economics, Univ. Toulouse, Toulouse, France); Edouard Pauwels (IRIT, CNRS, Univ. Toulouse, Toulouse, France); Antonio Silveti-Falls (Toulouse School of Economics, Univ. Toulouse, Toulouse, France)
Pseudocode | No | The paper describes algorithmic processes such as backpropagation and gradient descent in narrative text, but it does not include any formal pseudocode blocks or clearly labeled algorithm sections. (An illustrative sketch of the fixed-point differentiation the paper discusses is given after this table.)
Open Source Code | No | The paper references existing software libraries (e.g., TensorFlow, PyTorch, JAX, cvxpylayers) but does not state that the authors' own implementation of the presented methodology is open source, nor does it provide a link to one.
Open Datasets | No | The numerical experiments are based on constructed mathematical problems and models (e.g., a bilevel problem, the Lorenz ODE system, a Lasso formulation with design matrix X and response y) rather than publicly available datasets, so no dataset access information is provided. (The generic bilevel Lasso formulation is written out after this table for reference.)
Dataset Splits | No | Because the numerical experiments use constructed mathematical models rather than external datasets, the paper mentions no training, validation, or test splits.
Hardware Specification | No | The paper does not report any hardware details (e.g., GPU models, CPU types, memory amounts, or cloud instance specifications) used to run the experiments.
Software Dependencies | No | The paper mentions software components such as "TensorFlow [1], PyTorch [47], JAX [16]" and "cvxpylayers [2]" used in the work, but it provides no version numbers for these dependencies, which would be necessary for reproducibility. (A minimal cvxpylayers usage sketch follows after this table.)
Experiment Setup | No | The paper describes the mathematical problems and general methods used for the numerical experiments (e.g., "implement gradient descent", "perform gradient ascent"), but it does not provide specific setup details such as concrete hyperparameter values (learning rates, batch sizes, number of iterations) or optimizer settings.
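
As noted in the Pseudocode row, the paper discusses differentiating through fixed-point equations (as in deep equilibrium networks) in narrative form only. For reference, the following is a minimal JAX sketch of the standard implicit/adjoint scheme for a fixed point z = F(theta, x, z). It is not the authors' implementation: the map F, the tolerances, and the iteration counts are invented for illustration, and the adjoint iteration is only justified under contraction-type regularity assumptions, precisely the kind of hypothesis whose failure produces the pathological gradient dynamics the paper exhibits.

```python
import jax
import jax.numpy as jnp

def F(theta, x, z):
    # Hypothetical fixed-point map: a contractive "layer" tanh(W z + x).
    return jnp.tanh(theta @ z + x)

def solve_fixed_point(theta, x, tol=1e-6, max_iter=500):
    # Plain forward iteration z_{k+1} = F(theta, x, z_k).
    z = jnp.zeros_like(x)
    for _ in range(max_iter):
        z_next = F(theta, x, z)
        if jnp.max(jnp.abs(z_next - z)) < tol:
            return z_next
        z = z_next
    return z

@jax.custom_vjp
def fixed_point_layer(theta, x):
    return solve_fixed_point(theta, x)

def fixed_point_fwd(theta, x):
    z_star = solve_fixed_point(theta, x)
    return z_star, (theta, x, z_star)

def fixed_point_bwd(res, z_bar):
    theta, x, z_star = res
    # Adjoint fixed-point equation u = z_bar + (dF/dz)^T u, solved by
    # iteration; valid when the iteration contracts, which is exactly the
    # regularity that can fail in the nonsmooth setting the paper studies.
    _, vjp_z = jax.vjp(lambda z: F(theta, x, z), z_star)
    u = z_bar
    for _ in range(500):
        u = z_bar + vjp_z(u)[0]
    # Push the adjoint u through the theta and x slots of F.
    _, vjp_params = jax.vjp(lambda t, v: F(t, v, z_star), theta, x)
    return vjp_params(u)

fixed_point_layer.defvjp(fixed_point_fwd, fixed_point_bwd)

# Usage: gradient of a scalar loss of the equilibrium point.
key = jax.random.PRNGKey(0)
W = 0.1 * jax.random.normal(key, (3, 3))   # small norm, so tanh(Wz + x) contracts
x = jnp.ones(3)
grad_W = jax.grad(lambda W: jnp.sum(fixed_point_layer(W, x) ** 2))(W)
```

Note that under jax.jit the Python-level stopping test would have to be replaced by jax.lax.while_loop; the eager form above keeps the sketch short.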
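
For the Lasso hyperparameter-tuning task mentioned in the Open Datasets row, the standard bilevel formulation is reproduced below from the textbook Lasso definition. The outer criterion l (e.g., a validation loss) is a placeholder; the paper may instantiate the outer objective differently.

```latex
\min_{\lambda > 0} \; \ell\bigl(\hat{\beta}(\lambda)\bigr)
\qquad \text{where} \qquad
\hat{\beta}(\lambda) \in \operatorname*{arg\,min}_{\beta \in \mathbb{R}^{p}}
\; \tfrac{1}{2} \lVert X\beta - y \rVert_{2}^{2} + \lambda \lVert \beta \rVert_{1}.
```

The nonsmooth dependence of the inner solution on lambda is what makes naive implicit differentiation delicate here.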
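
The Software Dependencies row cites cvxpylayers [2], the library behind the paper's conic optimization layers. For readers unfamiliar with it, the snippet below is a minimal differentiable-layer example following cvxpylayers' public PyTorch interface; the least-squares problem, shapes, and names are illustrative choices, not the paper's experiment.

```python
import cvxpy as cp
import numpy as np
import torch
from cvxpylayers.torch import CvxpyLayer

# Differentiable least-squares layer: z*(b) = argmin_z ||A z - b||^2,
# with A fixed and b supplied (and differentiated through) at call time.
m, n = 3, 2
A = np.random.randn(m, n)
z = cp.Variable(n)
b = cp.Parameter(m)
problem = cp.Problem(cp.Minimize(cp.sum_squares(A @ z - b)))
assert problem.is_dpp()  # required for differentiable canonicalization

layer = CvxpyLayer(problem, parameters=[b], variables=[z])

b_torch = torch.randn(m, requires_grad=True)
z_star, = layer(b_torch)       # solve the problem, keeping the autograd graph
z_star.sum().backward()        # differentiate through the solver
print(b_torch.grad)
```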