Super-efficiency of automatic differentiation for functions defined as a minimum

Authors: Pierre Ablin, Gabriel Peyré, Thomas Moreau

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our analysis is backed by numerical experiments on toy problems and on Wasserstein barycenter computation. Finally, we provide numerical illustrations of the aforementioned results in Sec. 5. All experiments are performed in Python using pytorch (Paszke et al., 2019). The code to reproduce the figures is available online. (A minimal sketch of this bilevel setup appears after the table.)
Researcher Affiliation | Academia | Pierre Ablin (1), Gabriel Peyré (1), Thomas Moreau (2). (1) Département de Mathématiques et Applications, ENS Ulm, Paris, France. (2) Université Paris-Saclay, Inria, CEA, Palaiseau, 91120, France.
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper.
Open Source Code | Yes | All experiments are performed in Python using pytorch (Paszke et al., 2019). The code to reproduce the figures is available online. See https://github.com/tomMoral/diffopt.
Open Datasets | No | The paper describes problem formulations (e.g., Ridge Regression, Regularized Logistic Regression, Least p-th norm, Regularized Wasserstein Distance) and their properties, but does not provide specific access information (links, DOIs, repositories, or formal citations with author/year) for publicly available datasets used in the experiments.
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) for training, validation, or test sets.
Hardware Specification | No | No specific hardware details (e.g., CPU/GPU models or memory) used for running the experiments are mentioned in the paper.
Software Dependencies | No | The paper mentions 'Python using pytorch', but does not specify version numbers for these software components.
Experiment Setup | No | The paper discusses parameters such as the step size ρ and the regularization parameter λ in the context of the algorithms and losses. However, it does not provide concrete hyperparameter values (e.g., specific learning rates, batch sizes, or numbers of iterations) or comprehensive system-level training settings for all experiments, and often states conditions in terms of variable names rather than fixed values.
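The paper's central object is a function defined through an inner minimization, g(x) = f(x, z*(x)) with z*(x) = argmin_z f(x, z), whose gradient can be estimated by running automatic differentiation through the iterates of the inner solver. The following is a minimal, illustrative PyTorch sketch of that setting. It is not the authors' released code (which lives at https://github.com/tomMoral/diffopt); the ridge-regression inner problem, the dimensions, λ, the step size, and the iteration count are all assumptions chosen for the example.

```python
# Illustrative sketch only: differentiate g(x) = f(x, z*(x)) by backpropagating
# through the gradient-descent iterates that approximate z*(x). Problem sizes,
# lambda, the step size, and the iteration count are arbitrary choices, not
# values taken from the paper.
import torch

torch.manual_seed(0)
n, d, lam = 50, 10, 0.1
A = torch.randn(n, d) / n ** 0.5   # design matrix, rescaled for mild conditioning
b = torch.randn(n)                 # targets


def f(x, z):
    """Inner objective: ridge regression whose targets are shifted by x."""
    return 0.5 * ((A @ z - (b + x)) ** 2).sum() + 0.5 * lam * (z ** 2).sum()


x = torch.zeros(n, requires_grad=True)

# Inner solver: plain gradient descent on z. The closed-form inner gradient
# depends on x, so the whole chain of iterates is differentiable w.r.t. x.
L = torch.linalg.eigvalsh(A.T @ A).max() + lam   # Lipschitz constant of grad_z f
rho = 1.0 / L                                    # step size
z = torch.zeros(d)
for _ in range(500):
    grad_z = A.T @ (A @ z - (b + x)) + lam * z   # nabla_z f(x, z)
    z = z - rho * grad_z

# Automatic differentiation through the iterates approximates grad g(x).
g = f(x, z)
g.backward()
grad_autodiff = x.grad.detach().clone()

# Reference gradient via the exact minimizer and the envelope theorem:
# grad g(x) = grad_x f(x, z*(x)), since grad_z f(x, z*(x)) = 0.
with torch.no_grad():
    z_star = torch.linalg.solve(A.T @ A + lam * torch.eye(d), A.T @ (b + x))
    grad_analytic = (b + x) - A @ z_star

# The gap shrinks as the number of inner iterations grows.
print("max |autodiff - analytic| =",
      (grad_autodiff - grad_analytic).abs().max().item())
```

Using the closed-form inner gradient keeps the sketch short; letting autograd differentiate the inner objective itself (with create_graph=True) would give the same autodiff gradient at a higher memory cost.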