reproducibilityindex.ai

Composed Fine-Tuning: Freezing Pre-Trained Denoising Autoencoders for Improved Generalization

Authors: Sang Michael Xie, Tengyu Ma, Percy Liang

ICML 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We prove for two-layer ReLU networks that composed fine-tuning significantly reduces the complexity of the predictor, thus improving generalization. Empirically, we show that composed fine-tuning improves over standard fine-tuning on two pseudocode-to-code translation datasets (3% and 6% relative).
Researcher Affiliation	Academia	Department of Computer Science, Stanford University. Correspondence to: Sang Michael Xie <xie@cs.stanford.edu>.
Pseudocode	No	The paper describes the proposed method and objective function mathematically but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	All data and code for reproducing the experiments are on our CodaLab worksheet and GitHub repository.
Open Datasets	Yes	We evaluate composed fine-tuning on two pseudocode-to-code datasets, SANSTYPE and SPOC (Kulal et al., 2019)... All data and code for reproducing the experiments are on our CodaLab worksheet and GitHub repository.
Dataset Splits	Yes	Out of the 6200 labeled examples (62 characters × 100 fonts), we split randomly into 2500 training examples, 100 validation examples, and 3600 test examples.
Hardware Specification	No	The paper does not explicitly mention specific hardware details such as GPU or CPU models, memory, or specific cloud computing instances used for running the experiments.
Software Dependencies	No	The paper mentions using 'Transformers' as a model architecture, but it does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or specific library versions).
Experiment Setup	Yes	In all models, we use weight decay, dropout, attention dropout, and ReLU dropout as regularization and use λ = 1 to balance between fitting the composed and direct objectives. During inference, we use greedy decoding for simplicity...
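
The Experiment Setup row mentions a weight λ = 1 between the composed and direct objectives and greedy decoding at inference. The following is a minimal Python sketch of those two ideas, not the authors' implementation. The function names (`combined_loss`, `greedy_decode`, `step_fn`), the use of per-token cross-entropy, and the choice to apply λ to the direct term are all assumptions made for illustration; the row does not say which term λ scales, and with λ = 1 the two choices give the same value.

```python
import math

LAMBDA = 1.0  # value quoted in the Experiment Setup row


def cross_entropy(probs, target):
    """Negative log-likelihood of the target index under a probability vector."""
    return -math.log(probs[target])


def combined_loss(composed_probs, direct_probs, target, lam=LAMBDA):
    """Sum of the composed and direct objectives for one target token.

    composed_probs: output distribution after the frozen denoiser (assumed input)
    direct_probs:   output distribution of the base predictor alone (assumed input)
    lam:            weight on the direct term (assumption: which term is scaled)
    """
    composed = cross_entropy(composed_probs, target)
    direct = cross_entropy(direct_probs, target)
    return composed + lam * direct


def greedy_decode(step_fn, start_token, end_token, max_len=50):
    """Greedy decoding: at each step, append the single most probable token.

    step_fn is a hypothetical callable mapping the token prefix (a list of
    ints) to a list of next-token probabilities.
    """
    tokens = [start_token]
    for _ in range(max_len):
        probs = step_fn(tokens)
        next_token = max(range(len(probs)), key=probs.__getitem__)
        tokens.append(next_token)
        if next_token == end_token:
            break
    return tokens
```

In this sketch, greedy decoding keeps only one hypothesis per step, which is cheaper than beam search but can miss higher-probability sequences; the source says only that it was chosen for simplicity.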