Neural Bregman Divergences for Distance Learning
Authors: Fred Lu, Edward Raff, Francis Ferraro
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also demonstrate that our method more faithfully learns divergences over a set of both new and previously studied tasks, including asymmetric regression, ranking, and clustering. Our tests further extend to known asymmetric, but non-Bregman tasks, where our method still performs competitively despite misspecification, showing the general utility of our approach for asymmetric learning. |
| Researcher Affiliation | Collaboration | University of Maryland, Baltimore County; Booz Allen Hamilton |
| Pseudocode | Yes | Algorithm 1 Neural Bregman Divergence (NBD). (A hypothetical code sketch of the underlying Bregman computation follows the table.) |
| Open Source Code | No | The paper mentions adapting PyTorch code from another work ("We adapt their PyTorch code from https://github.com/spitis/deepnorms.") but does not provide a link or explicit statement for the code of their proposed method (NBD). |
| Open Datasets | Yes | The dataset consists of paired MNIST images... We also make a harder version by substituting MNIST with CIFAR10... We use the INRIA Holidays dataset (see Appendix G). |
| Dataset Splits | Yes | A 50K/10K train-test split was used. The training set consists of 10,000 pairs sampled with random crops each epoch from the first 200 of the images, while the test set is a fixed set of 10,000 pairs with crops drawn from the last 100. |
| Hardware Specification | Yes | We used Quadro RTX 6000 GPUs to train our models. |
| Software Dependencies | No | The paper mentions "PyTorch API (Paszke et al., 2017)" but does not specify a version number for PyTorch itself or any other software libraries used. |
| Experiment Setup | Yes | We used batch size 128, 200 epochs, 1e-3 learning rate for all models. A typical example of the parameters is batch size 256, 250 epochs, learning rate 1e-3. We used 100 epochs of training with learning rate 1e-3, batch size 1000. We use default hyperparameter settings to keep methods comparable, such as Adam optimizer, learning rate 1e-3, batch size 128, embedding dimension 128, and 200 epochs. (A toy training loop wiring these quoted defaults together also follows the table.) |
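
Because the method is available only as pseudocode (Algorithm 1) with no released source, the following is a minimal PyTorch sketch of the general idea behind a neural Bregman divergence: learn a convex generator φ and evaluate D_φ(x, y) = φ(x) − φ(y) − ⟨∇φ(y), x − y⟩. The ICNN-style `ConvexNet`, the `bregman_divergence` helper, and all layer sizes here are illustrative assumptions, not the paper's Algorithm 1.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvexNet(nn.Module):
    """Small input-convex network (ICNN-style) standing in for a learned
    convex generator phi. Hypothetical sketch, not the paper's architecture."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.in1 = nn.Linear(dim, hidden)
        self.in2 = nn.Linear(dim, hidden)
        self.z2 = nn.Parameter(torch.rand(hidden, hidden) * 0.1)   # kept >= 0
        self.out_z = nn.Parameter(torch.rand(1, hidden) * 0.1)     # kept >= 0

    def forward(self, x):
        z = F.softplus(self.in1(x))
        # Non-negative weights on the z-path plus convex, nondecreasing
        # activations keep the output convex in x.
        z = F.softplus(F.linear(z, self.z2.clamp_min(0)) + self.in2(x))
        return F.linear(z, self.out_z.clamp_min(0)).squeeze(-1)

def bregman_divergence(phi, x, y):
    """D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>; asymmetric in (x, y)."""
    if not y.requires_grad:                       # leaf inputs: enable grad w.r.t. y
        y = y.detach().requires_grad_(True)
    phi_y = phi(y)
    (grad_y,) = torch.autograd.grad(phi_y.sum(), y, create_graph=True)
    return phi(x) - phi_y - ((x - y) * grad_y).sum(dim=-1)

# Toy usage: a batch of 32 feature pairs in 8 dimensions.
phi = ConvexNet(dim=8)
x, y = torch.randn(32, 8), torch.randn(32, 8)
d_xy = bregman_divergence(phi, x, y)   # shape (32,); in general != bregman_divergence(phi, y, x)
```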
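The Experiment Setup row quotes default hyperparameters (Adam optimizer, learning rate 1e-3, batch size 128, embedding dimension 128, 200 epochs). A toy loop wiring those quoted defaults to the sketch above might look as follows; the encoder architecture, the random stand-in data, and the MSE regression objective are all assumptions for illustration.

```python
import torch
from torch import nn, optim
from torch.nn import functional as F
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical encoder: a 784-dim input mimics flattened MNIST images and the
# 128-dim output matches the quoted embedding dimension.
encoder = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 128))
phi = ConvexNet(dim=128)                        # convex generator from the sketch above

opt = optim.Adam(list(encoder.parameters()) + list(phi.parameters()), lr=1e-3)

# Toy stand-in data: input pairs (a, b) with a scalar regression target t
# (e.g. a known divergence value); not the paper's datasets.
a, b, t = torch.randn(512, 784), torch.randn(512, 784), torch.rand(512)
loader = DataLoader(TensorDataset(a, b, t), batch_size=128, shuffle=True)

for epoch in range(200):                        # quoted defaults: 200 epochs, batch size 128
    for xa, xb, target in loader:
        pred = bregman_divergence(phi, encoder(xa), encoder(xb))
        loss = F.mse_loss(pred, target)         # regress the learned divergence onto the target
        opt.zero_grad()
        loss.backward()
        opt.step()
```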