Backpropagation through Combinatorial Algorithms: Identity with Projection Works
Authors: Subham Sekhar Sahoo, Anselm Paulus, Marin Vlastelica, Vít Musil, Volodymyr Kuleshov, Georg Martius
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that such a straightforward hyperparameter-free approach is able to compete with previous more complex methods on numerous experiments such as backpropagation through discrete samplers, deep graph matching, and image retrieval. (A minimal sketch of this identity scheme appears after the table.) |
| Researcher Affiliation | Academia | Subham Sekhar Sahoo, Cornell University, Ithaca, USA (ssahoo@cs.cornell.edu); Anselm Paulus, MPI Intelligent Systems, Tübingen, Germany (anselm.paulus@tue.mpg.de); Marin Vlastelica, MPI Intelligent Systems, Tübingen, Germany (marin.vlastelica@tue.mpg.de); Vít Musil, Masaryk University, FI, Brno, Czech Republic (musil@fi.muni.cz); Volodymyr Kuleshov, Cornell Tech, NYC, USA (kuleshov@cornell.edu); Georg Martius, MPI Intelligent Systems, Tübingen, Germany (georg.martius@tue.mpg.de) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at github.com/martius-lab/solver-differentiation-identity. ... A curated Github repository for reproducing all results is available at github.com/martius-lab/solver-differentiation-identity. |
| Open Datasets | Yes | The datasets used in the experiments, i.e. BeerAdvocate (McAuley et al., 2012), MNIST (LeCun et al., 2010), SPair-71k (Min et al., 2019), Globe TSP and Warcraft Shortest Path (Vlastelica et al., 2020), CUB-200-2011 (Welinder et al., 2010), are publicly available. |
| Dataset Splits | Yes | We used the MNIST (LeCun et al., 2010) dataset for the problem, which consisted of 50,000 training examples and 10,000 validation and test examples each. ... The SPair-71k dataset consists of 70,958 annotated image pairs, with images from the Pascal VOC 2012 and Pascal 3D+ datasets. It comes with a pre-defined train/validation/test split of 53,340 / 5,384 / 12,234. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models or processor types used for running its experiments. It mentions using neural networks and solvers, implying computational resources, but without explicit specifications. |
| Software Dependencies | Yes | We use the gurobi (Gurobi Optimization, LLC, 2022) solver for the MIP formulation of TSP. |
| Experiment Setup | Yes | We train the model for 100 epochs... We train all models for 10 epochs of 2000 training iterations each, with image pairs processed in batches of 8. ... We use the Adam optimizer with an initial learning rate of 2×10⁻³, which is halved every 2 epochs. The learning rate for finetuning the VGG weights is multiplied by 10⁻². ... We train all models for 80 epochs using the Adam optimizer, using a learning rate of 5×10⁻⁷... In all experiments a weight decay of 4×10⁻⁴ is used, as well as a drop of the learning rate by 70% after 35 epochs. The learning rate of the embedding layer is multiplied by 3... Images are processed in batches of 128... The network was trained using the Adam optimizer with a learning rate of 10⁻⁴ for 100 epochs and a batch size of 50. (A configuration sketch of one quoted schedule follows the table.) |
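The Research Type row quotes the paper's core claim: treating the combinatorial solver as the identity map on the backward pass, combined with a projection of the solver's input, yields a hyperparameter-free gradient estimator. Below is a minimal PyTorch-style sketch of that scheme, assuming a black-box solver acting on a cost vector; the `solver` stub and the top-3 selection are hypothetical stand-ins, not the authors' code (see their repository at github.com/martius-lab/solver-differentiation-identity for the actual implementation).

```python
import torch

def solver(theta: torch.Tensor) -> torch.Tensor:
    # Hypothetical stand-in for a black-box combinatorial solver
    # (TSP, graph matching, shortest path): here, a simple top-3 selection.
    kth = theta.topk(3, dim=-1).values[..., -1:]
    return (theta >= kth).float()

class SolverWithIdentityBackward(torch.autograd.Function):
    """Forward: call the (non-differentiable) solver.
    Backward: pretend the solver was the identity map."""

    @staticmethod
    def forward(ctx, theta):
        return solver(theta)

    @staticmethod
    def backward(ctx, grad_output):
        # Identity: pass the downstream gradient through unchanged.
        return grad_output

def identity_with_projection(theta: torch.Tensor) -> torch.Tensor:
    # Projection: L2-normalize the cost vector before the solver call;
    # differentiating through the normalization discards the radial
    # gradient component, which cannot change the solver's output.
    theta = theta / theta.norm(dim=-1, keepdim=True).clamp_min(1e-12)
    return SolverWithIdentityBackward.apply(theta)
```

Usage follows the standard pattern: `theta = net(x)`, then `y = identity_with_projection(theta)`, and `loss(y, target).backward()` propagates into `net` as if the solver were the identity.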
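The Experiment Setup row mixes schedules from several experiments. As a concrete reading of one of them (the deep graph matching setup: Adam at 2×10⁻³ halved every 2 epochs, with the VGG finetuning rate scaled by 10⁻²), here is a hedged PyTorch sketch; `backbone` and `head` are placeholder modules, not the authors' architecture.

```python
import torch

backbone = torch.nn.Linear(512, 128)  # placeholder for the VGG feature extractor
head = torch.nn.Linear(128, 10)       # placeholder for the matching head

base_lr = 2e-3
optimizer = torch.optim.Adam([
    {"params": head.parameters(), "lr": base_lr},
    {"params": backbone.parameters(), "lr": base_lr * 1e-2},  # VGG lr × 10⁻²
])

# Halve all learning rates every 2 epochs, matching the quoted schedule.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.5)

for epoch in range(10):        # "10 epochs of 2000 training iterations each"
    for step in range(2000):
        optimizer.zero_grad()
        # forward pass, loss.backward(), and optimizer.step() go here
    scheduler.step()
```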