Gradients as Features for Deep Representation Learning
Authors: Fangzhou Mu, Yingyu Liang, Yin Li
ICLR 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our method is evaluated across a number of representation-learning tasks on several datasets and using different network architectures. Strong results are obtained in all settings, and are well-aligned with our theoretical insights. Our experimental results are organized into two parts. We first perform ablation studies to understand the representation power of the gradient features. Next, we evaluate our method on three representation-learning tasks: learning deep generative models, self-supervised learning using a pretext task, and transfer learning from ImageNet. |
| Researcher Affiliation | Academia | Fangzhou Mu, Yingyu Liang Department of Computer Sciences University of Wisconsin-Madison {fmu, yliang}@cs.wisc.edu Yin Li Departments of Biostatistics & Computer Sciences University of Wisconsin-Madison yin.li@wisc.edu |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | Project webpage at http://pages.cs.wisc.edu/~fmu/gradfeat20. The project webpage states: "Code will be released soon". |
| Open Datasets | Yes | We train a BiGAN on CIFAR-10 (Krizhevsky et al., 2009)... We use the PyTorch (Paszke et al., 2017) distribution of ImageNet pre-trained ResNet18 (He et al., 2016) as the base network for VOC07 (Everingham et al., 2010) object classification. Datasets used: SVHN, CIFAR-10, CIFAR-100, VOC07, COCO2014. |
| Dataset Splits | Yes | For the SVHN and CIFAR-10/100 experiments, we train the models for 80K iterations with initial learning rate 1e-3, halved every 20K iterations. For the VOC07 and COCO2014 experiments, we train the models for 50 epochs with initial learning rate 1e-3, halved every 20 epochs. We train on the trainval split of VOC07 and the train split of COCO2014 for object classification, and report the mean average precision (mAP) scores on their respective test and val splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types) used for running its experiments. |
| Software Dependencies | No | We use the PyTorch (Paszke et al., 2017) distribution of ImageNet pre-trained ResNet18... All models are trained with the Adam optimizer (Kingma & Ba, 2015). Although software names are mentioned and cited, specific version numbers for these software dependencies are not provided. |
| Experiment Setup | Yes | We train the models for 80K iterations with initial learning rate 1e-3, halved every 20K iterations. For the VOC07 and COCO2014 experiments, we train the models for 50 epochs with initial learning rate 1e-3, halved every 20 epochs. All models are trained with the Adam optimizer (Kingma & Ba, 2015) with batch size 64, β1 = 0.5, β2 = 0.999 and weight decay 1e-6. In addition to Adam, we also use the SGD optimizer with weight decay 5e-5, momentum 0.9 and the same learning rate schedule for fine-tuning, and we report the better result between the two runs. |
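
As a reading aid, the sketch below expresses the optimizer and learning-rate schedule quoted in the "Experiment Setup" row for the SVHN and CIFAR-10/100 runs in PyTorch. The model and data here are placeholder stand-ins (a small linear classifier on random tensors), not the paper's networks or datasets; only the hyperparameters come from the paper.

```python
import torch
import torch.nn as nn

# Placeholder model and data; the actual networks and datasets are those of the paper.
model = nn.Linear(512, 10)                    # hypothetical classifier head

# Adam with beta1 = 0.5, beta2 = 0.999, weight decay 1e-6 (as reported in the paper).
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=1e-3,                                  # initial learning rate
    betas=(0.5, 0.999),
    weight_decay=1e-6,
)
# Initial learning rate 1e-3, halved every 20K iterations (SVHN / CIFAR-10/100 schedule).
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20_000, gamma=0.5)
criterion = nn.CrossEntropyLoss()

for iteration in range(80_000):               # 80K training iterations in total
    x = torch.randn(64, 512)                  # batch size 64 (dummy inputs here)
    y = torch.randint(0, 10, (64,))
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()                          # schedule stepped per iteration
```

For the VOC07 and COCO2014 experiments, the same schedule is stated in epochs (50 epochs, halved every 20 epochs), and the paper additionally reports fine-tuning with SGD (momentum 0.9, weight decay 5e-5, same learning rate schedule), keeping the better of the two runs.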