Co-Tuning for Transfer Learning

Authors: Kaichao You, Zhi Kou, Mingsheng Long, Jianmin Wang

NeurIPS 2020

Reproducibility assessment (variable, result, and LLM response):

Research Type (Experimental): A simple instantiation of the framework shows strong empirical results in four visual classification tasks and one NLP classification task, bringing up to 20% relative improvement.

Researcher Affiliation (Academia): Kaichao You, Zhi Kou, Mingsheng Long, Jianmin Wang. School of Software, BNRist, Research Center for Big Data, Tsinghua University, China. {ykc20,kz19}@mails.tsinghua.edu.cn, {mingsheng,jimwang}@tsinghua.edu.cn

Pseudocode (Yes): Algorithm 1, category relationship learning (the reverse approach); Algorithm 2, neural network calibration.
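Algorithm 1 estimates the relationship between pre-trained (source) categories and target categories, and Algorithm 2 calibrates the pre-trained classifier so its predicted probabilities can be trusted. For illustration, here is a minimal sketch of the simpler direct estimate of p(y_s | y_t) that the paper describes alongside the reverse approach: average the calibrated source-category predictions over the target samples of each class. All names are illustrative; this is a sketch, not the paper's exact Algorithm 1.

```python
import numpy as np

def estimate_category_relationship(source_probs, target_labels, num_target_classes):
    """Direct estimate of p(y_s | y_t): average the calibrated source-category
    probabilities of the pre-trained model over target samples of each class.

    source_probs  : (N, C_s) calibrated predictions on target training data
    target_labels : (N,) integer target labels in [0, num_target_classes)
    """
    num_source_classes = source_probs.shape[1]
    relationship = np.zeros((num_target_classes, num_source_classes))
    for t in range(num_target_classes):
        # Mean calibrated prediction over all target samples whose label is t
        relationship[t] = source_probs[target_labels == t].mean(axis=0)
    return relationship  # row t approximates p(y_s | y_t = t)
```

Co-Tuning then translates each target label into a soft source label through this relationship and fine-tunes with both the target classification loss and a loss on the retained source head.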
Open Source Code (Yes): Code is available at https://github.com/thuml/CoTuning.

Open Datasets (Yes): In computer vision, we have models pre-trained on the ImageNet (Deng et al., 2009) classification task... For medium-scale classification tasks, we use the CUB-200-2011 (Welinder et al., 2010), Stanford Cars (Krause et al., 2013), and FGVC Aircraft (Maji et al., 2013) datasets. ... The large-scale dataset is constructed from the COCO 2017 object detection task. ... We experiment with the English named entity recognition (NER) task of CoNLL 2003 (Sang & De Meulder, 2003).

Dataset Splits (Yes): For datasets without validation splits, 20% of the training data is used for validation (split once, after which the validation set is fixed) and the remaining 80% is used for training. This way, each dataset has a train/val/test split.
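A minimal sketch of such a fixed one-time split using scikit-learn (which the paper mentions). The label array, seed, and stratification below are illustrative assumptions, not details from the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder labels standing in for the official training set of a real dataset.
train_labels = np.random.randint(0, 200, size=6000)

indices = np.arange(len(train_labels))
train_idx, val_idx = train_test_split(
    indices,
    test_size=0.2,          # 20% of the training data held out for validation
    random_state=0,         # split once with a fixed seed; the val set then stays fixed
    stratify=train_labels,  # assumption: class-stratified (the paper does not say)
)
```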
Hardware Specification (No): The paper mentions '10K GPU hours' in the context of hyperparameter tuning for other methods, but does not specify the exact GPU models, CPU models, or other hardware used for its own experiments.

Software Dependencies (No): The paper mentions using PyTorch and scikit-learn for the implementation but does not provide specific version numbers for these or other software dependencies.

Experiment Setup (Yes): The learning rate for randomly initialized parameters is ten times the learning rate for pre-trained parameters... Hyperparameters of Co-Tuning and the compared methods are selected by performance on target validation data... all models are optimized by SGD with 0.9 momentum. Each experiment is repeated three times with different random seeds to collect the mean and standard deviation of performance.
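A minimal PyTorch sketch of this optimizer setup, assuming a ResNet-50 backbone with a freshly initialized classification head. The number of classes and the base learning rate are illustrative; the paper selects such hyperparameters on target validation data.

```python
import torch
from torchvision import models

num_classes = 200  # illustrative, e.g., CUB-200-2011
model = models.resnet50(pretrained=True)
model.fc = torch.nn.Linear(model.fc.in_features, num_classes)  # randomly initialized head

base_lr = 0.01  # illustrative; the actual value is tuned on target validation data
backbone_params = [p for n, p in model.named_parameters() if not n.startswith("fc.")]
optimizer = torch.optim.SGD(
    [
        {"params": backbone_params, "lr": base_lr},             # pre-trained parameters
        {"params": model.fc.parameters(), "lr": 10 * base_lr},  # new parameters: 10x LR
    ],
    momentum=0.9,
)
```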