Convex Representation Learning for Generalized Invariance in Semi-Inner-Product Space
Authors: Yingyi Ma, Vignesh Ganapathiraman, Yaoliang Yu, Xinhua Zhang
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that the new s.i.p.-based algorithm learns more predictive representations than strong baselines. Supporting evidence: Table 1 (test accuracy of minimizing empirical risk on binary classification tasks); Table 2 (test accuracy on the mixup classification task based on 10 random runs); Table 3 (test accuracy on multilabel prediction with logic relationships). |
| Researcher Affiliation | Academia | (1) University of Illinois at Chicago; (2) University of Waterloo and Vector Institute. Correspondence to: Xinhua Zhang <zhangx@uic.edu>. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks, nor does it have clearly labeled algorithm sections. |
| Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing code for the work described, nor does it provide a direct link to a source-code repository. |
| Open Datasets | Yes | We experimented with three image datasets: MNIST, USPS, and Fashion MNIST, each containing 10 classes. We conducted experiments on three multilabel datasets where additional information is available about the hierarchy in their class labels: Enron (Klimt and Yang, 2004), WIPO (Rousu et al., 2006), and Reuters (Lewis et al., 2004). Multilabel dataset link: https://sites.google.com/site/hrsvmproject/datasets-hier |
| Dataset Splits | No | The paper specifies training and test set sizes (e.g., '1000 training and 1000 test examples', 'n examples for training and n examples for testing', '100/100, 200/200, 500/500 randomly drawn train/test examples') but does not explicitly mention a separate validation split or describe how validation was performed to set any hyperparameters. |
| Hardware Specification | No | The paper mentions support from 'Google Cloud' but does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | To ease the computation of derivatives, we resorted to finite differences for all methods, with two pixels for shifting, 10 degrees for rotation, and 0.1 unit for scaling. The λ was generated from a Beta distribution, whose parameter was tuned to optimize performance. We also varied p in {n, 2n, 4n} when training Embed. Each setting was evaluated 10 times with randomly sampled training and test data. (Illustrative sketches of the finite-difference and mixup steps follow the table.) |
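
The finite-difference scheme quoted in the Experiment Setup row is straightforward to sketch. Below is a minimal, hypothetical Python rendering (not the authors' code): it approximates the derivative of an image with respect to each transformation parameter by applying the transformation one step and differencing, using the reported step sizes of 2 pixels (shift), 10 degrees (rotation), and 0.1 unit (scale). The helper names, the grayscale `numpy`/`scipy.ndimage` representation, and the horizontal-shift direction are our assumptions.

```python
# A minimal sketch (not the authors' code) of the finite-difference scheme the
# paper reports: approximate the derivative of an image w.r.t. a transformation
# parameter by transforming the image a small step and differencing.
# Step sizes follow the reported setup: 2 pixels (shift), 10 degrees (rotation),
# 0.1 unit (scale). Function names and the shift direction are our own choices.

import numpy as np
from scipy.ndimage import shift, rotate, zoom


def scale_keep_shape(img, factor):
    """Rescale a 2D image by `factor`, then center-crop or zero-pad back to its shape."""
    scaled = zoom(img, factor)
    h, w = img.shape
    sh, sw = scaled.shape
    if factor >= 1.0:
        # Crop the center of the enlarged image.
        top, left = (sh - h) // 2, (sw - w) // 2
        return scaled[top:top + h, left:left + w]
    # Pad the shrunken image back onto the original canvas.
    out = np.zeros_like(img)
    top, left = (h - sh) // 2, (w - sw) // 2
    out[top:top + sh, left:left + sw] = scaled
    return out


def finite_difference_derivatives(img):
    """Finite-difference derivatives of `img` w.r.t. shift, rotation, and scale."""
    d_shift = (shift(img, (0, 2)) - img) / 2.0               # per pixel of horizontal shift
    d_rot = (rotate(img, 10, reshape=False) - img) / 10.0    # per degree of rotation
    d_scale = (scale_keep_shape(img, 1.1) - img) / 0.1       # per 0.1 unit of scaling
    return d_shift, d_rot, d_scale


if __name__ == "__main__":
    x = np.random.rand(28, 28)  # stand-in for a 28x28 MNIST image
    for name, d in zip(("shift", "rotate", "scale"), finite_difference_derivatives(x)):
        print(name, d.shape)
```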
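The mixup ingredient in the same row, where λ is drawn from a Beta distribution, can be sketched the same way. This is a generic mixup step under our own assumptions: one-hot (or real-valued) labels, and `alpha=0.2` as a placeholder default, since the tuned Beta parameter value is not reported in the paper.

```python
# A hedged sketch of the mixup step: lambda ~ Beta(alpha, alpha) convexly
# combines each batch with a shuffled copy of itself. alpha=0.2 is a common
# mixup default used purely for illustration; the paper tunes this parameter.

import numpy as np


def mixup_batch(x, y, alpha=0.2, rng=None):
    """Mix a batch (x, y) with a shuffled copy of itself using lambda ~ Beta(alpha, alpha)."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)           # mixing coefficient in (0, 1)
    perm = rng.permutation(len(x))         # random pairing of examples
    x_mixed = lam * x + (1.0 - lam) * x[perm]
    y_mixed = lam * y + (1.0 - lam) * y[perm]  # assumes one-hot or real-valued labels
    return x_mixed, y_mixed, lam
```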