MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures
Authors: Jeong Un Ryu, JaeWoong Shin, Hae Beom Lee, Sung Ju Hwang
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate the efficacy and generality of MetaPerturb trained on a specific source domain and architecture, by applying it to the training of diverse neural architectures on heterogeneous target datasets against various regularizers and fine-tuning. The results show that the networks trained with MetaPerturb significantly outperform the baselines on most of the tasks and architectures, with a negligible increase in the parameter size and no hyperparameters to tune. |
| Researcher Affiliation | Collaboration | Jeongun Ryu¹, Jaewoong Shin¹, Hae Beom Lee¹, Sung Ju Hwang¹·² (¹KAIST, ²AITRICS, South Korea) |
| Pseudocode | Yes | Algorithm 1 Meta-training |
| Open Source Code | No | The paper does not explicitly state that open-source code for the methodology is provided, nor does it include a link to a code repository. |
| Open Datasets | Yes | We use TinyImageNet [1] as the source dataset, which is a subset of the ImageNet [33] dataset. ... We then transfer our perturbation function to the following target tasks: STL10 [7], CIFAR-100 [18], Stanford Dogs [16], Stanford Cars [17], Aircraft [25], and CUB [44]. |
| Dataset Splits | Yes | We class-wisely split the dataset into 10 splits to produce heterogeneous task samples. ... Thus we select the best performing noise generator over five meta-training runs using a validation set consisting of samples from CIFAR-100, that is disjoint from s-CIFAR100, and use it throughout all the experiments in the paper. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments, only a general mention of 'a single GPU'. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | Yes | For the base regularizations, we used the weight decay of 0.0005 and random cropping and horizontal flipping in all experiments. |
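The base regularization quoted above (weight decay of 0.0005 plus random cropping and horizontal flipping) is a standard recipe. The following is a minimal NumPy sketch of those three components, not the paper's implementation; function names, the padding amount, and the learning rate are illustrative assumptions.

```python
import numpy as np

def random_crop(img, size, pad=4, rng=None):
    # Zero-pad the image, then take a random crop of the requested size
    # (CIFAR-style cropping; pad=4 is an assumed, common choice).
    rng = rng or np.random.default_rng(0)
    h, w = img.shape[:2]
    pad_width = ((pad, pad), (pad, pad)) + ((0, 0),) * (img.ndim - 2)
    padded = np.pad(img, pad_width)
    y = rng.integers(0, 2 * pad + h - size + 1)
    x = rng.integers(0, 2 * pad + w - size + 1)
    return padded[y:y + size, x:x + size]

def horizontal_flip(img, p=0.5, rng=None):
    # Flip the image left-right with probability p.
    rng = rng or np.random.default_rng(0)
    return img[:, ::-1] if rng.random() < p else img

def sgd_step(w, grad, lr=0.1, weight_decay=5e-4):
    # Weight decay of 0.0005 enters the update as an L2 penalty gradient;
    # the learning rate here is illustrative.
    return w - lr * (grad + weight_decay * w)
```

Under this sketch, `random_crop(img, 32)` on a 32x32x3 image returns another 32x32x3 view drawn from the padded canvas, and `sgd_step` shrinks weights slightly even when the loss gradient is zero, which is the regularizing effect of weight decay.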