Convex Representation Learning for Generalized Invariance in Semi-Inner-Product Space

Authors: Yingyi Ma, Vignesh Ganapathiraman, Yaoliang Yu, Xinhua Zhang

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Our experiments demonstrate that the new s.i.p.-based algorithm learns more predictive representations than strong baselines." Supporting evidence: Table 1 (test accuracy of minimizing empirical risk on binary classification tasks), Table 2 (test accuracy on mixup classification based on 10 random runs), Table 3 (test accuracy on multilabel prediction with logic relationship). |
| Researcher Affiliation | Academia | "¹University of Illinois at Chicago, ²University of Waterloo and Vector Institute. Correspondence to: Xinhua Zhang <zhangx@uic.edu>." |
| Pseudocode | No | The paper contains no structured pseudocode or algorithm blocks, and no clearly labeled algorithm sections. |
| Open Source Code | No | The paper neither states unambiguously that the authors are releasing code for this work nor provides a direct link to a source-code repository. |
| Open Datasets | Yes | "We experimented with three image datasets: MNIST, USPS, and Fashion MNIST, each containing 10 classes." "We conducted experiments on three multilabel datasets where additional information is available about the hierarchy in its class labels: Enron (Klimt and Yang, 2004), WIPO (Rousu et al., 2006), Reuters (Lewis et al., 2004)." The multilabel datasets are hosted at https://sites.google.com/site/hrsvmproject/datasets-hier. |
| Dataset Splits | No | The paper specifies training and test set sizes (e.g., "1000 training and 1000 test examples", "n examples for training and n examples for testing", "100/100, 200/200, 500/500 randomly drawn train/test examples") but does not mention a separate validation split or explain how any hyperparameters were validated. |
| Hardware Specification | No | The paper acknowledges support from Google Cloud but gives no specific hardware details, such as GPU models, CPU types, or memory, used to run the experiments. |
| Software Dependencies | No | The paper does not name the ancillary software, such as libraries or solvers with version numbers, needed to replicate the experiments. |
| Experiment Setup | Yes | "To ease the computation of derivative, we resorted to finite difference for all methods, with two pixels for shifting, 10 degrees for rotation, and 0.1 unit for scaling." "The λ was generated from a Beta distribution, whose parameter was tuned to optimize the performance." "We also varied p in {n, 2n, 4n} when training Embed." Each setting was evaluated 10 times with randomly sampled training and test data. (A finite-difference sketch follows the table.) |
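
For concreteness, here is a minimal sketch of the two setup ingredients quoted in the Experiment Setup row: finite-difference derivatives with respect to shifting, rotation, and scaling, and mixup's Beta-sampled λ. It assumes 2-D grayscale NumPy arrays and SciPy's ndimage transforms; the function names, default step sizes, and the Beta parameter value are illustrative assumptions on our part, since the paper releases no code.

```python
import numpy as np
from scipy.ndimage import affine_transform, rotate, shift

def transformation_derivatives(img, dx=2, dtheta=10.0, ds=0.1):
    """Finite-difference derivatives of an image w.r.t. shift, rotation,
    and scaling, using the step sizes quoted in the paper: two pixels
    for shifting, 10 degrees for rotation, and 0.1 unit for scaling."""
    img = np.asarray(img, dtype=float)

    # Shift derivative: (image shifted by dx pixels - image) / dx.
    d_shift = (shift(img, (0, dx), order=1, mode="nearest") - img) / dx

    # Rotation derivative, with the step measured in degrees.
    d_rot = (rotate(img, dtheta, reshape=False, order=1, mode="nearest")
             - img) / dtheta

    # Scaling derivative: magnify by (1 + ds) about the image center.
    # affine_transform maps output coords o to input coords A @ o + b,
    # so A = I / s with b = c * (1 - 1/s) keeps the center c fixed.
    s = 1.0 + ds
    c = (np.asarray(img.shape) - 1) / 2.0
    scaled = affine_transform(img, np.eye(2) / s, offset=c * (1 - 1 / s),
                              order=1, mode="nearest")
    d_scale = (scaled - img) / ds

    return d_shift, d_rot, d_scale

def mixup(x1, x2, alpha=0.2, rng=None):
    """Mixup pair: lambda ~ Beta(alpha, alpha). The paper tunes the
    Beta parameter for performance; 0.2 here is only a placeholder."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2
```

The sketch uses one-sided differences; centered differences (e.g., rotating by ±dθ/2) would halve the discretization bias at the cost of one extra transform per direction.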