A Combinatorial Perspective on Transfer Learning

Authors: Jianan Wang, Eren Sezener, David Budden, Marcus Hutter, Joel Veness

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We now explore the properties of the NCTL algorithm empirically. We present our analysis in three parts: in Section 5.1, we demonstrate that NCTL exhibits combinatorial transfer using a more challenging variant of the standard Split MNIST protocol; in Section 5.2, we compare the performance of NCTL to many previous continual learning algorithms across standard Permuted and Split MNIST variants, using the same test and train splits as previously published; in Section 5.3, we further evaluate NCTL on a widely used real-world dataset Electricity (Elec2-3) which exhibits temporal dependencies and distribution drift.
Researcher Affiliation | Industry | DeepMind, aixi@google.com
Pseudocode | No | The paper describes the algorithm in prose within Section 4 'Algorithm' but does not provide formal pseudocode or a structured algorithm block within the document.
Open Source Code | Yes | Code at: github.com/deepmind/deepmind-research/.
Open Datasets | Yes | Most recent studies on continual learning define a protocol that makes use of an underlying MNIST (or similar) classification dataset. The popular (Disjoint) Split MNIST [ZPG17] involves separating the 10-class classification problem into 5 binary classification tasks... The Electricity (Elec2-3) dataset [HW99] contains 45,312 instances collected from the Australian NSW Electricity Market between May 1997 and December 1999.
Dataset Splits | Yes | We compare the performance of NCTL to many previous continual learning algorithms across standard Permuted and Split MNIST variants, using the same test and train splits as previously published.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running experiments were mentioned.
Software Dependencies | No | NCTL was implemented using JAX [BFH+18] and the DeepMind JAX ecosystem [BHK+20, HCNB20, HBV+20, BHQ+20]. However, specific version numbers for JAX or other libraries are not provided.
Experiment Setup | Yes | Hyper-parameters are optimized by grid search for both EWC and online EWC: the regularization constant λ is set to 10^6 and the learning rate is set to 10^-5 for EWC; and we have λ = 10^7, a learning rate of 10^-5, and the Fisher information matrix leak term γ (based on the formalism of [SLC+18]) set to 0.8 for online EWC. Our NCTL consisted of 50-25-1 neurons, where the base model for each neuron is a GGM with context space C = 2^4, trained with learning rate 0.001. We adopted the same hyperparameters for the GLN baseline.
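
For quick reference, the hyperparameters quoted in the Experiment Setup row are collected below as a minimal Python sketch. The variable names and dictionary layout are illustrative assumptions made here, not the structure of the released code, and the Split MNIST class pairing shown follows the standard convention rather than anything stated in the quoted text.

    # Hyperparameters quoted in the Experiment Setup row above, gathered into a
    # plain Python config. Names and structure are illustrative assumptions and
    # may differ from the released code at github.com/deepmind/deepmind-research/.

    EWC_CONFIG = {
        "lambda": 1e6,          # regularization constant, selected by grid search
        "learning_rate": 1e-5,
    }

    ONLINE_EWC_CONFIG = {
        "lambda": 1e7,
        "learning_rate": 1e-5,
        "gamma": 0.8,           # Fisher information matrix leak term [SLC+18]
    }

    NCTL_CONFIG = {
        "layer_sizes": (50, 25, 1),   # neurons per layer
        "context_space": 2 ** 4,      # context space size per GGM neuron
        "learning_rate": 1e-3,
    }

    # The GLN baseline reuses the NCTL hyperparameters.
    GLN_CONFIG = dict(NCTL_CONFIG)

    # Standard (Disjoint) Split MNIST pairing of the 10 digit classes into
    # 5 binary tasks (assumed conventional ordering, not quoted from the paper).
    SPLIT_MNIST_TASKS = [(0, 1), (2, 3), (4, 5), (6, 7), (8, 9)]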