Convex Deep Learning via Normalized Kernels

Authors: Özlem Aslan, Xinhua Zhang, Dale Schuurmans

NeurIPS 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To investigate the potential of deep versus shallow convex training methods, and global versus local training methods, we implemented the approach outlined above for a three-layer model along with comparison methods.
Researcher Affiliation | Academia | Özlem Aslan, Dept. of Computing Science, University of Alberta, Canada (ozlem@cs.ualberta.ca); Xinhua Zhang, Machine Learning Group, NICTA and ANU (xizhang@nicta.com.au); Dale Schuurmans, Dept. of Computing Science, University of Alberta, Canada (dale@cs.ualberta.ca)
Pseudocode | Yes | Algorithm 1: Conditional gradient algorithm to optimize f(M1, M2) for M1, M2 ∈ M. (A hedged sketch of one conditional gradient step is given after this table.)
Open Source Code | No | The paper does not contain any explicit statement about making the source code for the described methodology publicly available, nor does it provide a link to a code repository.
Open Datasets | Yes | Here we tried to replicate the results of [25] on similar data sets, USPS and COIL from [41], Letter from [42], MNIST, and CIFAR-100 from [43].
Dataset Splits | Yes | a given set of data (X, Y) is divided into separate training and test sets, (XL, YL) and XU, where labels are only included for the training set. (A sketch of such a split appears after this table.)
Hardware Specification | No | No specific hardware details (such as GPU/CPU models, memory, or cloud instance types) used for running the experiments are mentioned in the paper.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the implementation.
Experiment Setup | Yes | This loss can be naturally interpreted using the remark following Postulate 1. It encourages that the propensity of example j with respect to itself, Sjj, should be higher than its propensity with respect to other examples, Sij, by a margin that is defined through the normalized kernel M. However, note this loss does not correspond to a linear transfer between layers, even in terms of the propensity matrix S or the normalized output kernel M. As in all large margin methods, the initial loss (12) is a convex upper bound for an underlying discrete loss defined with respect to a step transfer. (A hedged sketch of a margin loss of this form appears after this table.)
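
The Pseudocode row above refers to the paper's Algorithm 1, a conditional gradient method for optimizing f(M1, M2) over normalized kernels M1, M2 ∈ M. The snippet below is a minimal sketch of one generic conditional gradient (Frank-Wolfe) step, under the assumption that the feasible set is the PSD cone intersected with a trace ball; the paper's actual set M of normalized kernels, its objective f(M1, M2), and the names `linear_oracle_psd_trace` and `tau` are not taken from the paper.

```python
# Minimal sketch of one conditional gradient (Frank-Wolfe) step.
# Assumption (not from the paper): feasible set {S : S PSD, trace(S) <= tau}.
import numpy as np

def linear_oracle_psd_trace(grad, tau):
    """Return argmin over {S PSD, tr(S) <= tau} of <grad, S>."""
    eigvals, eigvecs = np.linalg.eigh(grad)   # eigenvalues in ascending order
    if eigvals[0] >= 0:
        return np.zeros_like(grad)            # zero matrix is optimal
    v = eigvecs[:, :1]                        # eigenvector of the most negative eigenvalue
    return tau * (v @ v.T)                    # rank-one extreme point

def frank_wolfe_step(M, grad, t, tau):
    """Blend the current iterate with the oracle point using the 2/(t+2) step size."""
    S = linear_oracle_psd_trace(grad, tau)
    gamma = 2.0 / (t + 2.0)
    return (1.0 - gamma) * M + gamma * S
```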
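The Dataset Splits row quotes a transductive setup in which the data (X, Y) is divided into a labeled training part (XL, YL) and an unlabeled test part XU. The sketch below illustrates such a split; the 80/20 fraction, the seed, and the function name `transductive_split` are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of a transductive split: labels are kept only for the training part.
import numpy as np

def transductive_split(X, Y, train_fraction=0.8, seed=0):
    rng = np.random.default_rng(seed)
    perm = rng.permutation(X.shape[0])
    n_train = int(train_fraction * X.shape[0])
    train_idx, test_idx = perm[:n_train], perm[n_train:]
    X_L, Y_L = X[train_idx], Y[train_idx]   # labeled training set
    X_U = X[test_idx]                       # test set with labels held out
    return X_L, Y_L, X_U
```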
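The Experiment Setup row describes a large-margin loss that pushes each diagonal propensity Sjj above the off-diagonal propensities Sij by a margin defined through the normalized kernel M. The sketch below is one plausible hinge-style reading of that description, taking the margin to be M[j, j] - M[i, j]; this choice is an assumption for illustration and is not the paper's loss (12).

```python
# Hedged sketch of a large-margin hinge loss over a propensity matrix S.
# Assumption (not from the paper): the margin for pair (i, j) is M[j, j] - M[i, j].
import numpy as np

def margin_hinge_loss(S, M):
    n = S.shape[0]
    total = 0.0
    for j in range(n):
        for i in range(n):
            if i == j:
                continue
            margin = M[j, j] - M[i, j]
            # Penalize when S[j, j] does not exceed S[i, j] by the margin.
            total += max(0.0, margin - (S[j, j] - S[i, j]))
    return total / n
```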