Tied-Augment: Controlling Representation Similarity Improves Data Augmentation
Authors: Emirhan Kurtuluş, Zichao Li, Yann Dauphin, Ekin Dogus Cubuk
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To show the effectiveness of Tied-Augment, we experiment with training from scratch on CIFAR-10, CIFAR-100, and ImageNet. We extend these tests with finetuning and few-epoch / low-data regimes to simulate more realistic scenarios, where the amount of domain-specific data or available compute is limited. Lastly, we show that Tied-Augment significantly improves the performance of state-of-the-art methods (e.g. mixup and SAM) and can be used for semi-supervised learning (e.g. FixMatch). |
| Researcher Affiliation | Collaboration | 1 Stanford University, 2 Cagaloglu Anadolu Lisesi, 3 Google Research, Brain Team. Correspondence to: Emirhan Kurtulus <emirhank@stanford.edu>, Ekin D. Cubuk <cubuk@google.com>. |
| Pseudocode | Yes | Figure 1. Python code for Tied-Augment based on NumPy. (A hedged sketch of the objective this figure describes appears after the table.) |
| Open Source Code | Yes | We open source our code at https://github.com/ekurtulus/tied-augment/tree/main |
| Open Datasets | Yes | To show the effectiveness of Tied-Augment, we experiment with training from scratch on CIFAR-10, CIFAR-100, and ImageNet. We extend these tests with finetuning and few-epoch / low-data regimes to simulate more realistic scenarios, where the amount of domain-specific data or available compute is limited. |
| Dataset Splits | Yes | For runs with epoch={1, 2, 5}, the learning rate and weight-decay were tuned to maximize the validation accuracy of the identity baseline (since in this regime the identity baseline outperforms the Crop-Flip baseline). The learning rate and weight-decay hyperparameters for the 10-epoch models were tuned to maximize the validation set performance of the Crop-Flip baseline. |
| Hardware Specification | Yes | The numerical calculations reported in this paper were partially performed at TUBITAK ULAKBIM, High Performance and Grid Computing Center (TRUBA resources). ... we empirically observe an increase of roughly 30% on an Nvidia A100 on CIFAR-10. |
| Software Dependencies | No | Figure 1 mentions "Python code for Tied-Augment based on NumPy", but specific version numbers for Python or NumPy are not provided, nor are any other software dependencies with versions. |
| Experiment Setup | Yes | All ImageNet models use a learning rate of 0.4 with a batch size of 1024 and a weight-decay rate of 1e-4. The Tied-RandAugment model that was trained for 90 epochs used Crop-Flip on the first branch and RandAugment(N=2, M=9) on the other branch, with a Tied-weight of 4. The Tied-RandAugment ResNet-50 model that was trained for 360 epochs used RandAugment(N=2, M=13) for the first branch and RandAugment(N=2, M=9, P=0.5) for the second branch, with a Tied-weight of 12.0. The Tied-RandAugment ResNet-200 model used RandAugment(N=2, M=13) for both branches with a Tied-weight of 12.0. All Tied-Augment ImageNet models trained for 90 epochs used a Tied-weight of 4, and models trained for longer used a Tied-weight of 12. The optimal Tied-weight for Tied-mixup on ImageNet was 50. (An illustrative grouping of these settings appears after the table.) |
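
The pseudocode row above refers to Figure 1 of the paper, which we cannot reproduce here. As a rough illustration of the objective that figure describes, the following is a minimal NumPy sketch, assuming the classification term is standard cross-entropy applied to both augmented branches and the similarity term is a mean squared L2 distance between branch features scaled by the Tied-weight. The function names and the exact form of the similarity term are our assumptions; the paper's Figure 1 and released code are the authoritative versions.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean cross-entropy for integer class labels; logits: (batch, classes)."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def tied_augment_loss(feats1, feats2, logits1, logits2, labels, tied_weight):
    """Classification loss on both augmented views plus a weighted
    feature-similarity term (here: mean squared L2 distance between the
    features of the two branches -- an assumed choice of similarity)."""
    ce = cross_entropy(logits1, labels) + cross_entropy(logits2, labels)
    sim = np.mean(np.sum((feats1 - feats2) ** 2, axis=-1))
    return ce + tied_weight * sim

# Toy usage with random features/logits for a 4-example, 10-class batch.
rng = np.random.default_rng(0)
feats1, feats2 = rng.normal(size=(2, 4, 128))
logits1, logits2 = rng.normal(size=(2, 4, 10))
labels = rng.integers(0, 10, size=4)
print(tied_augment_loss(feats1, feats2, logits1, logits2, labels, tied_weight=4.0))
```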
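
To make the ImageNet hyperparameters in the experiment-setup row easier to scan, here is an illustrative grouping of them into plain Python dictionaries. The key names are hypothetical and do not come from the released code; the values are exactly those quoted in the row above.

```python
# Hypothetical config dicts; key names are illustrative, values are quoted
# from the experiment-setup row of the table above.
IMAGENET_COMMON = dict(learning_rate=0.4, batch_size=1024, weight_decay=1e-4)

TIED_RANDAUGMENT_90_EPOCHS = dict(
    **IMAGENET_COMMON,
    epochs=90,
    branch1="Crop-Flip",
    branch2="RandAugment(N=2, M=9)",
    tied_weight=4.0,
)

TIED_RANDAUGMENT_RESNET50_360_EPOCHS = dict(
    **IMAGENET_COMMON,
    epochs=360,
    branch1="RandAugment(N=2, M=13)",
    branch2="RandAugment(N=2, M=9, P=0.5)",
    tied_weight=12.0,
)

TIED_RANDAUGMENT_RESNET200 = dict(
    **IMAGENET_COMMON,
    branch1="RandAugment(N=2, M=13)",
    branch2="RandAugment(N=2, M=13)",
    tied_weight=12.0,
)
```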