Deep Divergence Learning
Authors: Hatice Kubra Cilingir, Rachel Manzelli, Brian Kulis
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In all three of the above settings, we show empirical results that highlight the benefits of our framework. In particular, we show that learning asymmetric divergences offers performance gains over existing symmetric models on benchmark data, and achieve state-of-the-art classification performance in some settings. |
| Researcher Affiliation | Academia | Hatice Kubra Cilingir, Rachel Manzelli, Brian Kulis; Department of Electrical and Computer Engineering, Boston University, Boston, Massachusetts, USA. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/kubrac/Deep_Bregman. |
| Open Datasets | Yes | We generated n = 500 training points... We use WHARF (Bruno et al., 2014), MHEALTH (Banos et al., 2014; 2015), and WISDM (Weiss et al., 2019) datasets in our initial experiments. ... on the four benchmark datasets used in the original triplet loss paper (Hoffer & Ailon, 2015) MNIST, Cifar10, SVHN, and STL10 as well as Fashion MNIST. ... We apply our approach on 28x28 MNIST and CELEBA datasets, as is standard for GAN applications. |
| Dataset Splits | Yes | We treat several architecture choices as hyperparameters and validate over these hyperparameters using Bayesian optimization (tuned separately for each dataset); Table 5 lists the hyperparameters that we search over, along with the ranges of values considered. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency details with version numbers. |
| Experiment Setup | Yes | The number of units in each layer were set to 1000, 500, and 2, and standard ReLU activation was used. ... We treat several architecture choices as hyperparameters and validate over these hyperparameters using Bayesian optimization... Model hyperparams: layers 2-5, conv filters 16-128, conv kernels 3-9... Training hyperparams: margin 0.1-2.0, epochs 10-40, learning rate 10^-5 to 10^-1, batch size 32-128, optimizer adam / sgd / rms... |
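
The Experiment Setup row quotes concrete layer sizes (1000, 500, and 2 units with ReLU). Below is a minimal sketch of such an embedding network; the flattened 28x28 input dimension, the margin value of 1.0, and the use of PyTorch's built-in triplet margin loss are illustrative assumptions, not details confirmed by the quote or the authors' released code.

```python
# Sketch of the fully connected embedding network described in the Experiment Setup
# row (layers of 1000, 500, and 2 units with ReLU). Input dimension, margin, and the
# triplet margin loss are assumptions for illustration only.
import torch
import torch.nn as nn

class EmbeddingNet(nn.Module):
    def __init__(self, input_dim: int = 784):  # 784 assumes flattened 28x28 images
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, 1000), nn.ReLU(),
            nn.Linear(1000, 500), nn.ReLU(),
            nn.Linear(500, 2),  # 2-dimensional embedding, as in the quoted setup
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

if __name__ == "__main__":
    model = EmbeddingNet()
    # Margin of 1.0 is a placeholder inside the 0.1-2.0 range quoted above.
    loss_fn = nn.TripletMarginLoss(margin=1.0)
    anchor, positive, negative = (torch.randn(32, 784) for _ in range(3))
    loss = loss_fn(model(anchor), model(positive), model(negative))
    print(loss.item())
```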
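
The Dataset Splits and Experiment Setup rows both quote hyperparameter ranges validated with Bayesian optimization. The sketch below only samples one configuration from those quoted ranges with plain random sampling; the random search itself, the odd kernel sizes, the power-of-two batch sizes, and the helper name `sample_config` are assumptions, not the paper's tuning procedure.

```python
# Sketch of sampling one hyperparameter configuration from the ranges quoted in the
# Experiment Setup row. The paper uses Bayesian optimization tuned per dataset;
# random sampling here is a simplification for illustration.
import random

def sample_config() -> dict:
    return {
        "layers": random.randint(2, 5),              # model: number of layers
        "conv_filters": random.randint(16, 128),     # model: filters per conv layer
        "conv_kernel": random.choice([3, 5, 7, 9]),  # model: kernel size (odd sizes assumed)
        "margin": random.uniform(0.1, 2.0),          # training: margin
        "epochs": random.randint(10, 40),
        "learning_rate": 10 ** random.uniform(-5, -1),   # log-uniform over 10^-5 to 10^-1
        "batch_size": random.choice([32, 64, 128]),      # powers of two assumed
        "optimizer": random.choice(["adam", "sgd", "rms"]),
    }

if __name__ == "__main__":
    print(sample_config())
```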