Transfer Learning via $\ell_1$ Regularization
Authors: Masaaki Takada, Hironori Fujisawa
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results demonstrate that the proposed method effectively balances stability and plasticity. |
| Researcher Affiliation | Collaboration | Masaaki Takada, Toshiba Corporation, Tokyo 105-0023, Japan (masaaki1.takada@toshiba.co.jp); Hironori Fujisawa, The Institute of Statistical Mathematics, Tokyo 190-8562, Japan (fujisawa@ism.ac.jp) |
| Pseudocode | Yes | We provide a coordinate descent algorithm for Transfer Lasso. It is guaranteed to converge to a global optimal solution [36], because the problem is convex and the penalty is separable. Let $\beta$ be the current value and $\tilde{\beta}$ the initial (source) estimate. Consider a new value of $\beta_j$ as a minimizer of $L(\beta; \tilde{\beta})$ when the elements of $\beta$ other than $\beta_j$ are fixed. We have $\partial_{\beta_j} L(\beta; \tilde{\beta}) = -\frac{1}{n} X_j^\top (y - X_{-j}\beta_{-j}) + \beta_j + \lambda\alpha\,\mathrm{sgn}(\beta_j) + \lambda(1-\alpha)\,\mathrm{sgn}(\beta_j - \tilde{\beta}_j) = 0$, where $X_j$ and $X_{-j}$ denote the $j$-th column of $X$ and $X$ without the $j$-th column, respectively, and $\mathrm{sgn}(\cdot)$ denotes the sign function. Hence we obtain the update rule $\beta_j \leftarrow T\left(\frac{1}{n} X_j^\top (y - X_{-j}\beta_{-j}),\ \lambda\alpha,\ \lambda(1-\alpha),\ \tilde{\beta}_j\right)$, where, for $b \ge 0$, $T(z, \gamma_1, \gamma_2, b) := \begin{cases} 0 & \text{for } -\gamma_1-\gamma_2 \le z \le \gamma_1-\gamma_2 \\ z - \gamma_1 + \gamma_2 & \text{for } \gamma_1-\gamma_2 < z < \gamma_1-\gamma_2+b \\ b & \text{for } \gamma_1-\gamma_2+b \le z \le \gamma_1+\gamma_2+b \\ z - (\gamma_1+\gamma_2)\,\mathrm{sgn}(z) & \text{otherwise,} \end{cases}$ and $T(z, \gamma_1, \gamma_2, b) = -T(-z, \gamma_1, \gamma_2, -b)$ for $b < 0$. |
| Open Source Code | No | The paper does not provide any links or explicit statements about the availability of open-source code for the described methodology. |
| Open Datasets | Yes | The newsgroup message data (https://kdd.ics.uci.edu/databases/20newsgroups/20newsgroups.html) comprises messages from Usenet posts on different topics. We basically followed the concept drift experiments in [18] and used preprocessed data (http://lpis.csd.auth.gr/mlkd/concept_drift.html). |
| Dataset Splits | Yes | The regularization parameters λ and α were determined by ten-fold cross validation. The examples were divided into 30 batches of 50 examples each, without changing the order of the samples. We trained models using each batch and evaluated them using the next batch. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper does not specify any software libraries or their version numbers used in the experiments. |
| Experiment Setup | Yes | The regularization parameters $\lambda$ and $\alpha$ were determined by ten-fold cross validation. The parameter $\lambda$ was selected from a decreasing sequence from $\lambda_{\max}$ to $\lambda_{\max} \times 10^{-4}$ in log-scale, where $\lambda_{\max}$ was calculated as in Section 3.2. The parameter $\alpha$ was selected among $\{0, 0.25, 0.5, 0.75, 1\}$. Each dataset was centered and standardized such that $\bar{y} = 0$, $\bar{X}_j = 0$, and $\mathrm{sd}(X_j) = 1$ in preprocessing. |
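The coordinate descent update quoted above can be sketched in NumPy. This is a minimal illustration, not the authors' implementation: it assumes columns of $X$ are centered and scaled so that $X_j^\top X_j / n = 1$ (as in the stated preprocessing), and all function and variable names are our own.

```python
import numpy as np

def soft_threshold_transfer(z, g1, g2, b):
    """Minimize 0.5*(t - z)**2 + g1*|t| + g2*|t - b| over t.

    Piecewise closed form for b >= 0; the b < 0 case follows by
    the symmetry T(z, g1, g2, b) = -T(-z, g1, g2, -b).
    """
    if b < 0:
        return -soft_threshold_transfer(-z, g1, g2, -b)
    if -g1 - g2 <= z <= g1 - g2:
        return 0.0                       # shrunk exactly to zero
    if g1 - g2 < z < g1 - g2 + b:
        return z - g1 + g2               # strictly between 0 and b
    if g1 - g2 + b <= z <= g1 + g2 + b:
        return b                         # snapped to the source estimate
    return z - (g1 + g2) * np.sign(z)    # beyond both kinks

def transfer_lasso_cd(X, y, beta_tilde, lam, alpha, n_iter=100):
    """Coordinate descent for the Transfer Lasso objective (sketch).

    Assumes X is centered with unit-variance columns, so that
    X[:, j] @ X[:, j] / n == 1 and the 1-D subproblem has the
    closed form above.
    """
    n, p = X.shape
    beta = beta_tilde.astype(float).copy()
    r = y - X @ beta                     # running residual
    for _ in range(n_iter):
        for j in range(p):
            # z_j = (1/n) X_j^T (y - X_{-j} beta_{-j})
            z = X[:, j] @ (r + X[:, j] * beta[j]) / n
            new_bj = soft_threshold_transfer(
                z, lam * alpha, lam * (1 - alpha), beta_tilde[j]
            )
            r += X[:, j] * (beta[j] - new_bj)
            beta[j] = new_bj
    return beta
```

Setting `alpha=1` recovers the ordinary Lasso update (plain soft-thresholding), while `alpha=0` penalizes only deviations from `beta_tilde`, which is why a large `lam` then pins the solution to the source estimate.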
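The preprocessing and $\lambda$ grid described in the setup row can be sketched as follows. The standardization matches the stated $\bar{y}=0$, $\bar{X}_j=0$, $\mathrm{sd}(X_j)=1$; the $\lambda_{\max}$ formula used here is the plain-Lasso bound $\max_j |X_j^\top y|/n$, a stand-in assumption, since the paper derives its own value in Section 3.2 which this excerpt does not reproduce.

```python
import numpy as np

def standardize(X, y):
    """Center y, then center each column of X and scale it to unit sd."""
    y_c = y - y.mean()
    Xc = X - X.mean(axis=0)
    Xs = Xc / Xc.std(axis=0)
    return Xs, y_c

def lambda_grid(X, y, n_lambda=100):
    """Decreasing log-scale grid from lambda_max down to lambda_max * 1e-4.

    Uses the ordinary-Lasso lambda_max = max_j |X_j^T y| / n as a
    placeholder; the paper's Section 3.2 gives the exact value for
    Transfer Lasso.
    """
    n = X.shape[0]
    lam_max = np.max(np.abs(X.T @ y)) / n
    return np.logspace(np.log10(lam_max), np.log10(lam_max * 1e-4), n_lambda)
```

Cross validation would then loop this grid (together with $\alpha \in \{0, 0.25, 0.5, 0.75, 1\}$) over ten folds, as the setup row states.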