Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Tied-Augment: Controlling Representation Similarity Improves Data Augmentation
Authors: Emirhan Kurtuluş, Zichao Li, Yann Dauphin, Ekin Dogus Cubuk
ICML 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To show the effectiveness of Tied-Augment, we experiment with training from-scratch on CIFAR-10, CIFAR-100, and ImageNet. We extend these tests with finetuning and few-epoch / low-data regimes to simulate more realistic scenarios, where the amount of domain-specific data or available compute is limited. Lastly, we show that Tied-Augment significantly improves the performance of state-of-the-art methods (e.g. mixup and SAM) and can be used for semisupervised learning (e.g. FixMatch). |
| Researcher Affiliation | Collaboration | ¹Stanford University, ²Cagaloglu Anadolu Lisesi, ³Google Research, Brain Team. Correspondence to: Emirhan Kurtulus <EMAIL>, Ekin D. Cubuk <EMAIL>. |
| Pseudocode | Yes | Figure 1. Python code for Tied-Augment based on NumPy. |
| Open Source Code | Yes | We open source our code at https://github.com/ekurtulus/tied-augment/tree/main |
| Open Datasets | Yes | To show the effectiveness of Tied-Augment, we experiment with training from-scratch on CIFAR-10, CIFAR-100, and ImageNet. We extend these tests with finetuning and few-epoch / low-data regimes to simulate more realistic scenarios, where the amount of domain-specific data or available compute is limited. |
| Dataset Splits | Yes | For runs with epoch={1, 2, 5}, the learning rate and weight-decay were tuned to maximize the validation accuracy of the identity baseline (since in this regime identity baseline outperforms the Crop-Flip baseline). The learning rate and weight-decay hyperparameters for the 10 epoch models were tuned to maximize the validation set performance of the Crop-Flip baseline. |
| Hardware Specification | Yes | The numerical calculations reported in this paper were partially performed at TUBITAK ULAKBIM, High Performance and Grid Computing Center (TRUBA resources). ... we empirically observe an increase of roughly 30% increase on an Nvidia A100 on CIFAR-10. |
| Software Dependencies | No | Figure 1 mentions "Python code for Tied-Augment based on NumPy", but specific version numbers for Python or NumPy are not provided, nor are any other software dependencies with versions. |
| Experiment Setup | Yes | All ImageNet models use a learning rate of 0.4 with a batch size of 1024, weight-decay rate of 1e-4. The Tied RandAugment model that was trained for 90 epochs used Crop-Flip on first branch, and RandAugment(N=2, M=9) on the other branch, with a Tied-weight of 4. The Tied RandAugment ResNet-50 model that was trained for 360 epochs used RandAugment(N=2, M=13) for the first branch and RandAugment(N=2, M=9, P=0.5) for the second branch, with a Tied-weight of 12.0. The Tied-RandAugment ResNet-200 model used RandAugment(N=2, M=13) for both branches with a Tied-weight of 12.0. All Tied-Augment ImageNet models trained for 90 epochs used a Tied-weight of 4, and models trained for longer used a Tied-weight of 12. The optimal Tied-weight for Tiedmixup on Imagenet was 50. |
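
The ImageNet settings quoted in the Experiment Setup row can be collected into a small configuration sketch for anyone attempting a reproduction. This is only an illustrative summary of the quoted values, not the authors' code: the names `RandAugmentSpec`, `TiedRunConfig`, `IMAGENET_COMMON`, `IMAGENET_RUNS`, and `default_tied_weight` are hypothetical, and fields the quote does not state (for example the architecture of the 90-epoch run or the epoch count of the ResNet-200 run) are left as `None` rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class RandAugmentSpec:
    """RandAugment(N, M[, P]) as written in the quoted setup (hypothetical container)."""
    n: int                      # number of operations
    m: int                      # magnitude
    p: Optional[float] = None   # probability; None where the quote gives no value


@dataclass(frozen=True)
class TiedRunConfig:
    """One quoted ImageNet run; None marks details the quote does not state."""
    model: Optional[str]
    epochs: Optional[int]
    branch_1: Union[str, RandAugmentSpec]
    branch_2: Union[str, RandAugmentSpec]
    tied_weight: float


# Hyperparameters the quote says are shared by all ImageNet models.
IMAGENET_COMMON = {"learning_rate": 0.4, "batch_size": 1024, "weight_decay": 1e-4}

IMAGENET_RUNS = [
    # 90-epoch Tied RandAugment run: Crop-Flip on the first branch.
    TiedRunConfig(None, 90, "crop-flip", RandAugmentSpec(2, 9), 4.0),
    # 360-epoch ResNet-50 run.
    TiedRunConfig("ResNet-50", 360, RandAugmentSpec(2, 13),
                  RandAugmentSpec(2, 9, p=0.5), 12.0),
    # ResNet-200 run: same augmentation on both branches, epochs not quoted.
    TiedRunConfig("ResNet-200", None, RandAugmentSpec(2, 13),
                  RandAugmentSpec(2, 13), 12.0),
]


def default_tied_weight(epochs: int) -> float:
    """Quoted rule: 90-epoch models use 4, models trained for longer use 12."""
    if epochs == 90:
        return 4.0
    if epochs > 90:
        return 12.0
    # The quote gives no value for shorter ImageNet schedules.
    raise ValueError("Tied-weight for fewer than 90 epochs is not stated")


if __name__ == "__main__":
    for run in IMAGENET_RUNS:
        print(run.model, run.epochs, run.tied_weight)
```

The Tied-weight of 50 reported for Tiedmixup is omitted from the sketch because the quote gives no other settings for that run.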