Neural Bregman Divergences for Distance Learning

Authors: Fred Lu, Edward Raff, Francis Ferraro

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We also demonstrate that our method more faithfully learns divergences over a set of both new and previously studied tasks, including asymmetric regression, ranking, and clustering. Our tests further extend to known asymmetric, but non-Bregman tasks, where our method still performs competitively despite misspecification, showing the general utility of our approach for asymmetric learning."
Researcher Affiliation | Collaboration | University of Maryland, Baltimore County; Booz Allen Hamilton
Pseudocode | Yes | Algorithm 1: Neural Bregman Divergence (NBD). (An illustrative sketch of a learned Bregman divergence follows the table.)
Open Source Code | No | The paper mentions adapting PyTorch code from another work ("We adapt their PyTorch code from https://github.com/spitis/deepnorms.") but does not provide a link or an explicit code statement for its proposed method (NBD).
Open Datasets | Yes | "The dataset consists of paired MNIST images... We also make a harder version by substituting MNIST with CIFAR10... We use the INRIA Holidays dataset (see Appendix G)."
Dataset Splits | Yes | "A 50K/10K train-test split was used. The training set consists of 10,000 pairs sampled with random crops each epoch from the first 200 of the images, while the test set is a fixed set of 10,000 pairs with crops drawn from the last 100."
Hardware Specification | Yes | "We used Quadro RTX 6000 GPUs to train our models."
Software Dependencies | No | The paper mentions the "PyTorch API (Paszke et al., 2017)" but does not specify a version number for PyTorch itself or for any other software libraries used.
Experiment Setup | Yes | Reported settings vary by experiment: "We used batch size 128, 200 epochs, 1e-3 learning rate for all models." "A typical example of the parameters is batch size 256, 250 epochs, learning rate 1e-3." "We used 100 epochs of training with learning rate 1e-3, batch size 1000." "We use default hyperparameter settings to keep methods comparable, such as Adam optimizer, learning rate 1e-3, batch size 128, embedding dimension 128, and 200 epochs." (A configuration sketch using these defaults follows the table.)
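
To give a concrete sense of what Algorithm 1 learns, the sketch below is a minimal, hypothetical PyTorch illustration of a Bregman divergence D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>, with the potential phi parameterized by a small input-convex network so the divergence stays non-negative and asymmetric. The names (ConvexPotential, bregman_divergence) and the architecture are assumptions for illustration only, not the authors' NBD implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvexPotential(nn.Module):
    """Toy input-convex MLP phi(x): hidden-to-hidden weights are clamped to be
    non-negative and the activation is convex and non-decreasing, so the output
    is convex in x. An illustrative stand-in, not the paper's network."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.in0 = nn.Linear(dim, hidden)
        self.in1 = nn.Linear(dim, hidden)   # skip connections from the input
        self.in2 = nn.Linear(dim, 1)
        self.z1 = nn.Parameter(0.1 * torch.rand(hidden, hidden))  # kept >= 0 below
        self.z2 = nn.Parameter(0.1 * torch.rand(1, hidden))
        self.act = nn.Softplus()

    def forward(self, x):
        h = self.act(self.in0(x))
        h = self.act(F.linear(h, self.z1.clamp(min=0)) + self.in1(x))
        return F.linear(h, self.z2.clamp(min=0)) + self.in2(x)    # shape (B, 1)

def bregman_divergence(phi, x, y):
    """D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>, with the gradient
    taken by autograd so the divergence stays differentiable in phi's weights."""
    y = y.detach().requires_grad_(True)
    phi_y = phi(y).squeeze(-1)                                    # shape (B,)
    (grad_y,) = torch.autograd.grad(phi_y.sum(), y, create_graph=True)
    return phi(x).squeeze(-1) - phi_y - ((x - y) * grad_y).sum(-1)

# Usage: Bregman divergences are asymmetric in general, i.e. D(x, y) != D(y, x).
phi = ConvexPotential(dim=8)
x, y = torch.randn(4, 8), torch.randn(4, 8)
print(bregman_divergence(phi, x, y))    # four non-negative values
```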
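
Similarly, the following sketch wires the quoted default settings (Adam optimizer, learning rate 1e-3, batch size 128, 200 epochs) into a plain regression-style training loop, reusing ConvexPotential and bregman_divergence from the sketch above. The synthetic pairs, squared-distance targets, input dimension, and MSE loss are placeholders for illustration, not the paper's tasks or training objective.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

# Defaults quoted in the report: Adam, lr 1e-3, batch size 128, 200 epochs.
BATCH_SIZE, EPOCHS, LR, DIM = 128, 200, 1e-3, 16   # DIM is a placeholder input size

# Synthetic pairs with a stand-in "divergence" label (squared Euclidean distance).
xs, ys = torch.randn(10_000, DIM), torch.randn(10_000, DIM)
targets = ((xs - ys) ** 2).sum(-1)
loader = DataLoader(TensorDataset(xs, ys, targets), batch_size=BATCH_SIZE, shuffle=True)

phi = ConvexPotential(DIM)                         # from the sketch above
optimizer = torch.optim.Adam(phi.parameters(), lr=LR)

for epoch in range(EPOCHS):
    for x, y, t in loader:
        optimizer.zero_grad()
        loss = F.mse_loss(bregman_divergence(phi, x, y), t)
        loss.backward()
        optimizer.step()
```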