Learning representations for binary-classification without backpropagation

Authors: Mathias Lechner

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we experimentally evaluate the learning performance of mDFA on a set of empirical benchmarks. We aim to answer the following two questions: How well does mDFA perform compared to DFA, FA, and backpropagation in natural conditions, i.e., in binary classification tasks, and how much does the performance of mDFA degrade in multi-class classification tasks?" (A DFA-style update for this binary setting is sketched after the table.)
Researcher Affiliation | Academia | Mathias Lechner, IST Austria, Am Campus 1, Klosterneuburg, Austria, mlechner@ist.ac.at
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper.
Open Source Code | Yes | "We make an efficient TensorFlow implementation of all tested algorithms publicly available."
Open Datasets | Yes | "We created a series of binary classification benchmarks by randomly sampling two classes from the well-studied CIFAR-100 and ImageNet datasets." (A sampling sketch follows the table.)
Dataset Splits | Yes | "Secondly, for each method, we tuned the hyperparameters on a separate validation set and selected the best performing configuration to be evaluated on the test data." (See the model-selection sketch after the table.)
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts, or cloud instances) were mentioned for the experiments.
Software Dependencies | No | "We make an efficient TensorFlow implementation of all tested algorithms publicly available." (Explanation: while TensorFlow is mentioned, no specific version number or other software dependencies with versions are provided.)
Experiment Setup | Yes | "For all training methods, we fixed the batch size to 64, applied no regularization, no normalization, and no data augmentation. Optimizer, i.e., {Vanilla Gradient Descent, Adam (Kingma & Ba, 2014), RMSprop (Tieleman & Hinton, 2012)}, learning rate, training epochs, and weight initialization method were tuned on the validation set. We tested three different weight initialization schemes: all zeros, a scaled uniform, and a normal distribution." (The search space is sketched after the table.)
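The "Research Type" row describes mDFA, a feedback-alignment variant evaluated on binary classification. Below is a minimal NumPy sketch of the generic DFA update for a single-output (binary) network, the setting the paper targets; the additional feedback constraints that distinguish mDFA from plain DFA are given in the paper and are not reproduced here. Layer sizes, learning rate, and initialization scales are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, lr = 3072, 256, 1e-3          # CIFAR-sized flat input (assumed)

W1 = rng.normal(0.0, 0.05, (n_hidden, n_in))  # trained forward weights, layer 1
w2 = rng.normal(0.0, 0.05, (1, n_hidden))     # trained output weights
B1 = rng.normal(0.0, 0.05, (n_hidden, 1))     # fixed random feedback, never trained

x = rng.normal(size=n_in)                     # one example
y = 1.0                                       # binary label in {0, 1}

# Forward pass.
a1 = W1 @ x
h1 = np.maximum(a1, 0.0)                      # ReLU
p = 1.0 / (1.0 + np.exp(-(w2 @ h1)))          # sigmoid probability
e = float(p - y)                              # scalar output error

# DFA: the output layer uses its true local gradient, while the hidden
# layer receives the error through the fixed matrix B1 instead of w2.T.
# With a single output neuron, B1 reduces to a random vector.
w2 -= lr * e * h1[None, :]
delta1 = (B1 * e).ravel() * (a1 > 0.0)        # feedback projection * ReLU'
W1 -= lr * np.outer(delta1, x)
```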
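The "Open Datasets" row quotes how the binary benchmarks were constructed. A minimal sketch of that construction, assuming the CIFAR-100 loader bundled with tf.keras and a uniformly random class pair; the paper's exact sampling procedure and class pairs are not specified here, so this is an illustration rather than the authors' script.

```python
import numpy as np
import tensorflow as tf

# Load CIFAR-100; labels come back with shape (n, 1).
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar100.load_data()
y_train, y_test = y_train.ravel(), y_test.ravel()

rng = np.random.default_rng(0)
a, b = rng.choice(100, size=2, replace=False)   # two randomly sampled classes

def make_binary(x, y):
    """Keep only classes a and b, relabel them as 0 and 1."""
    mask = (y == a) | (y == b)
    return x[mask].astype("float32") / 255.0, (y[mask] == b).astype("int32")

x_tr, y_tr = make_binary(x_train, y_train)
x_te, y_te = make_binary(x_test, y_test)
```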
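The "Dataset Splits" row describes validation-based model selection. A minimal, framework-agnostic sketch of that protocol; train_model, evaluate, and the 10% validation fraction are hypothetical placeholders, not details taken from the paper.

```python
import numpy as np

def select_configuration(x, y, configs, train_model, evaluate,
                         val_frac=0.1, seed=0):
    """Hold out a validation split, score each configuration on it,
    and return the best one (to be evaluated once on the test set)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    n_val = int(val_frac * len(x))
    val_idx, tr_idx = idx[:n_val], idx[n_val:]

    scores = []
    for cfg in configs:
        model = train_model(x[tr_idx], y[tr_idx], **cfg)       # hypothetical
        scores.append(evaluate(model, x[val_idx], y[val_idx])) # hypothetical
    return configs[int(np.argmax(scores))]  # highest validation accuracy
```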
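The "Experiment Setup" row fixes the batch size and enumerates the tuned choices. Written out as a TensorFlow/Keras hyperparameter grid for concreteness; the learning-rate and epoch values below are assumptions, since the paper tunes these on the validation set without listing the grid in the quoted passage.

```python
import itertools
import tensorflow as tf

BATCH_SIZE = 64  # fixed for all methods; no regularization, normalization,
                 # or data augmentation is applied

OPTIMIZERS = {
    "sgd": tf.keras.optimizers.SGD,          # vanilla gradient descent
    "adam": tf.keras.optimizers.Adam,        # Kingma & Ba, 2014
    "rmsprop": tf.keras.optimizers.RMSprop,  # Tieleman & Hinton, 2012
}
INITIALIZERS = {
    "zeros": tf.keras.initializers.Zeros(),
    "uniform": tf.keras.initializers.VarianceScaling(distribution="uniform"),
    "normal": tf.keras.initializers.RandomNormal(stddev=0.05),
}
LEARNING_RATES = [1e-2, 1e-3, 1e-4]  # assumed grid
EPOCHS = [20, 50, 100]               # assumed grid

grid = list(itertools.product(OPTIMIZERS, LEARNING_RATES, EPOCHS, INITIALIZERS))
```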