Modular Universal Reparameterization: Deep Multi-task Learning Across Diverse Domains

Authors: Elliot Meyerson, Risto Miikkulainen

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The approach is first validated in a classic synthetic multi-task learning benchmark, and then applied to sharing across disparate architectures for vision, NLP, and genomics tasks. It discovers regularities across these domains, encodes them into sharable modules, and combines these modules systematically to improve performance in the individual tasks. The results confirm that sharing learned functionality across diverse domains and architectures is indeed beneficial, thus establishing a key ingredient for general problem solving in the future.
Researcher Affiliation | Collaboration | Elliot Meyerson (Cognizant, elliot.meyerson@cognizant.com); Risto Miikkulainen (Cognizant and The University of Texas at Austin, risto@cs.utexas.edu)
Pseudocode | Yes | Algorithm 1: Decomposed K-valued (1 + λ)-EA (a generic sketch of this algorithm family appears after the table)
Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing source code for the described methodology, nor does it provide a direct link to a code repository.
Open Datasets | Yes | The first task is CIFAR-10, the classic image classification benchmark of 60K images [26]. The second task is the WikiText-2 language modeling benchmark with over 2M tokens [36]. The third task is CRISPR binding prediction, where the goal is to predict the propensity of a CRISPR protein complex to bind to (and cut) unintended locations in the genome [21].
Dataset Splits | Yes | Unlike in previous work, five training samples for each task were withheld as validation data, making the setup more difficult (see the split sketch after the table).
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments.
Software Dependencies | No | The paper mentions "Pytorch" in the references, but it does not specify a version number for PyTorch or any other software dependencies used in the experiments.
Experiment Setup | Yes | All models are trained with Adam [24] for optimization with default parameters and a learning rate of 10^-4. Batch size is 32. Early stopping is used based on validation performance, with a patience of 20 epochs (see the training-loop sketch after the table).
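
The Pseudocode row names Algorithm 1, a decomposed K-valued (1 + λ)-EA. Below is a minimal sketch of the underlying (1 + λ)-EA over K-valued genes, assuming a generic `fitness` function; in the paper's decomposed variant the search is run per sharing location with fitness derived from task performance, and the mutation rate, `lam`, and generation budget here are illustrative placeholders, not the paper's settings.

```python
import random

def one_plus_lambda_ea(fitness, length, K, lam=4, generations=100, seed=0):
    """Sketch of a (1 + lambda)-EA over a genome of K-valued genes.

    `fitness` maps a genome (list of ints in [0, K)) to a score where
    higher is better; this interface is an assumption for illustration.
    """
    rng = random.Random(seed)
    parent = [rng.randrange(K) for _ in range(length)]
    parent_fit = fitness(parent)
    for _ in range(generations):
        best_child, best_fit = None, None
        for _ in range(lam):
            # Mutate each gene independently with probability 1/length.
            child = [rng.randrange(K) if rng.random() < 1.0 / length else g
                     for g in parent]
            f = fitness(child)
            if best_fit is None or f > best_fit:
                best_child, best_fit = child, f
        if best_fit >= parent_fit:  # elitist selection: keep parent unless matched or beaten
            parent, parent_fit = best_child, best_fit
    return parent, parent_fit
```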
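For the Dataset Splits row, the following sketch shows one way to withhold five training samples per task as validation data; the `task_data` structure (a dict from task name to sample list) is an assumption for illustration, not the paper's data format.

```python
import random

def withhold_validation(task_data, n_val=5, seed=0):
    """Withhold n_val training samples per task as validation data.

    `task_data` maps a task name to a list of samples; this shape is
    a hypothetical stand-in for the paper's per-task datasets.
    """
    rng = random.Random(seed)
    train, val = {}, {}
    for task, samples in task_data.items():
        samples = samples[:]           # copy so the shuffle is local
        rng.shuffle(samples)
        val[task] = samples[:n_val]    # five held-out samples per task
        train[task] = samples[n_val:]
    return train, val
```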
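The Experiment Setup row pins down the optimizer, batch size, and early-stopping rule. Below is a minimal PyTorch sketch of that configuration, assuming a classification loss and a hypothetical `val_loss_fn` helper; the paper's models and task-specific losses differ per domain.

```python
import torch

def train(model, train_loader, val_loss_fn, max_epochs=1000):
    """Training loop matching the reported setup: Adam with default
    parameters and lr 1e-4, early stopping on validation performance
    with a patience of 20 epochs. `train_loader` is assumed to be a
    DataLoader built with batch_size=32; `val_loss_fn` returns a
    scalar validation loss for the model and is hypothetical."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    best_val, patience, since_best = float("inf"), 20, 0
    for epoch in range(max_epochs):
        model.train()
        for x, y in train_loader:
            opt.zero_grad()
            # Cross-entropy here is a placeholder for the task loss.
            loss = torch.nn.functional.cross_entropy(model(x), y)
            loss.backward()
            opt.step()
        val = val_loss_fn(model)
        if val < best_val:
            best_val, since_best = val, 0
        else:
            since_best += 1
            if since_best >= patience:  # stop after 20 epochs without improvement
                break
    return model
```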