Representation Learning Beyond Linear Prediction Functions

Authors: Ziping Xu, Ambuj Tewari

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our theoretical results imply that simpler tasks generalize better. Though our theoretical results are shown for the global minimizer of empirical risks, their qualitative predictions still hold true for gradient-based optimization algorithms as verified by our simulations on deep neural networks. In this section, we use simulated environments to evaluate the actual performance of representation learning on DNNs trained with gradient-based optimization methods.
Researcher Affiliation | Academia | Ziping Xu, Department of Statistics, University of Michigan, zipingxu@umich.edu; Ambuj Tewari, Department of Statistics, University of Michigan, tewaria@umich.edu
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | The code is provided in the supplemental material.
Open Datasets | No | The paper mentions 'ImageNet' but does not provide concrete access information such as a link, DOI, repository, or a formal citation with authors and year for any dataset used.
Dataset Splits | No | The paper refers to 'n_so random samples' and 'n_ta random samples' but does not provide specific percentages, counts, or predefined splits for training, validation, or test sets in the main text. It defers training details to the supplementary materials.
Hardware Specification | No | The paper states that hardware details are in the supplementary materials but does not provide any specific hardware specifications (such as GPU models or CPU types) in the main text.
Software Dependencies | No | The paper mentions using 'Adam with default parameters' but does not provide specific version numbers for Python or any other software libraries or dependencies.
Experiment Setup | Yes | The first K layers are the shared representation. The source task is a multivariate regression problem with output dimension p and K_so layers following the representation. The target task is a single-output regression problem with K_ta layers following the representation. We used the same number of units for all the layers, which we denote by n_u. A representation is first trained on the source task using n_so random samples and is then fixed for the target task, which is trained on n_ta random samples. In contrast, the baseline method trains the target task directly on the same n_ta samples without the pretrained network. We use Adam with default parameters for all training. We use MSE (Mean Squared Error) to evaluate the performance under different settings.
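
To make the setup in the last row concrete, below is a minimal sketch of the pretrain-then-transfer procedure it describes. This is not the authors' code (which is in the supplemental material): PyTorch is an assumption, the helper `stack`, all sizes (d, p, n_u, K, K_so, K_ta, n_so, n_ta), and the random stand-in data are illustrative choices, and a faithful evaluation would compute MSE on held-out test samples rather than the final training loss reported here.

```python
import torch
import torch.nn as nn


def stack(in_dim, width, depth, out_dim=None):
    """`depth` fully connected ReLU layers of width `width`, optionally
    followed by a linear output layer of size `out_dim`."""
    layers, d = [], in_dim
    for _ in range(depth):
        layers += [nn.Linear(d, width), nn.ReLU()]
        d = width
    if out_dim is not None:
        layers.append(nn.Linear(d, out_dim))
    return nn.Sequential(*layers)


def train(model, X, Y, epochs=200):
    """Full-batch training with Adam (default parameters) and MSE loss.
    Returns the final training MSE."""
    opt = torch.optim.Adam(p for p in model.parameters() if p.requires_grad)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(X), Y)
        loss.backward()
        opt.step()
    return loss.item()


# Illustrative sizes only (not values from the paper).
d, n_u, p = 20, 64, 5        # input dim, units per layer, source output dim
K, K_so, K_ta = 2, 2, 2      # shared / source-head / target-head layer counts
n_so, n_ta = 5000, 100       # source and target sample sizes

# Random stand-ins for the paper's simulated source and target data.
X_so, Y_so = torch.randn(n_so, d), torch.randn(n_so, p)
X_ta, y_ta = torch.randn(n_ta, d), torch.randn(n_ta, 1)

# 1) Pretrain the shared representation (first K layers) on the source task:
#    multivariate regression with output dimension p and K_so layers after
#    the representation.
rep = stack(d, n_u, K)
source_head = stack(n_u, n_u, K_so - 1, out_dim=p)
train(nn.Sequential(rep, source_head), X_so, Y_so)

# 2) Freeze the pretrained representation and fit a fresh target head
#    (single-output regression, K_ta layers) on the n_ta target samples.
for param in rep.parameters():
    param.requires_grad = False
target_head = stack(n_u, n_u, K_ta - 1, out_dim=1)
transfer_mse = train(nn.Sequential(rep, target_head), X_ta, y_ta)

# 3) Baseline: train an identical architecture directly on the same n_ta
#    samples, without the pretrained representation.
baseline = nn.Sequential(stack(d, n_u, K), stack(n_u, n_u, K_ta - 1, out_dim=1))
baseline_mse = train(baseline, X_ta, y_ta)

print(f"transfer MSE: {transfer_mse:.4f}  |  baseline MSE: {baseline_mse:.4f}")
```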