Learning with Differentiable Perturbed Optimizers

Authors: Quentin Berthet, Mathieu Blondel, Olivier Teboul, Marco Cuturi, Jean-Philippe Vert, Francis Bach

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate experimentally the performance of our approach on various tasks." and "5 Experiments: We demonstrate the usefulness of perturbed maximizers in a supervised learning setting, as described in Section 4. We focus on a classification task and on two structured prediction tasks, label ranking and learning to predict shortest paths."
Researcher Affiliation | Collaboration | Quentin Berthet (Google Research, Brain Team, Paris, France, qberthet@google.com); Mathieu Blondel (Google Research, Brain Team, Paris, France, mblondel@google.com); Olivier Teboul (Google Research, Brain Team, Paris, France, oliviert@google.com); Marco Cuturi (Google Research, Brain Team, Paris, France, cuturi@google.com); Jean-Philippe Vert (Google Research, Brain Team, Paris, France, jpvert@google.com); Francis Bach (INRIA, DI ENS, PSL Research University, Paris, France, francis.bach@inria.fr)
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | "We will open-source a Python package allowing to turn any black-box solver into a differentiable function, in just a few lines of code." (An illustrative sketch of this idea follows the table.)
Open Datasets | Yes | "We use the perturbed argmax with Gaussian noise in an image classification task on the CIFAR-10 dataset." and "We use the same 21 datasets as in [28, 14]."
Dataset Splits | Yes | "Results are averaged over 10-fold CV and parameters tuned by 5-fold CV."
Hardware Specification | No | The paper mentions "In our experiments on GPU" but does not specify any particular hardware models or specifications.
Software Dependencies | No | The paper mentions "a Python package" but does not specify any software names with version numbers for reproducibility.
Experiment Setup | Yes | "We train a vanilla CNN with 10 network outputs that are the entries of θ; we minimize the Fenchel-Young loss between θ_i = g_w(x_i) and y_i, with different temperatures ε and numbers of perturbations M." and "We optimize over 50 epochs with batches of size 70, temperature ε = 1 and M = 1 (single perturbation)." (See the training-gradient sketch after the table.)
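
The open-source-code row quotes the paper's claim that any black-box solver can be turned into a differentiable function. A minimal NumPy sketch of that idea is below: it estimates the perturbed maximizer y_ε(θ) = E[solver(θ + εZ)] with Gaussian noise Z by Monte Carlo. The function names `perturbed_argmax` and `one_hot_argmax` are illustrative only and are not the API of the package the authors released.

```python
import numpy as np

def one_hot_argmax(theta):
    """A black-box discrete solver: the one-hot argmax of a score vector."""
    y = np.zeros_like(theta)
    y[np.argmax(theta)] = 1.0
    return y

def perturbed_argmax(theta, solver=one_hot_argmax, epsilon=1.0,
                     n_samples=1000, rng=None):
    """Monte Carlo estimate of the perturbed maximizer
    y_eps(theta) = E[solver(theta + epsilon * Z)],  Z ~ N(0, I).
    Averaging the solver's outputs over Gaussian perturbations gives a
    smooth surrogate of the piecewise-constant black-box solver."""
    rng = np.random.default_rng(0) if rng is None else rng
    noise = rng.standard_normal((n_samples,) + theta.shape)
    return np.mean([solver(theta + epsilon * z) for z in noise], axis=0)

# Example: the hard argmax of [1.0, 2.0, 0.5] is [0, 1, 0]; the perturbed
# version returns a smoothed frequency vector over vertices that sums to 1.
print(perturbed_argmax(np.array([1.0, 2.0, 0.5])))
```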
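For the experiment-setup row, the paper trains with a perturbed Fenchel-Young loss whose gradient with respect to the scores θ is y_ε(θ) − y, which is what makes training with a single perturbation (M = 1) practical. A hedged sketch building on `perturbed_argmax` above; `fenchel_young_grad` is a hypothetical helper name, not the released package's API.

```python
def fenchel_young_grad(theta, y_true, solver=one_hot_argmax,
                       epsilon=1.0, n_samples=1):
    """Stochastic gradient of the perturbed Fenchel-Young loss w.r.t. theta:
    grad = y_eps(theta) - y_true.  With n_samples=1 this mirrors the
    single-perturbation (M = 1) setting quoted in the experiment setup;
    larger n_samples (M) reduces the variance of the estimate."""
    y_eps = perturbed_argmax(theta, solver, epsilon=epsilon,
                             n_samples=n_samples)
    return y_eps - y_true

# In a training loop, this gradient with respect to the network outputs
# theta = g_w(x) would be backpropagated through the CNN by the chain rule.
```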