Modular Universal Reparameterization: Deep Multi-task Learning Across Diverse Domains
Authors: Elliot Meyerson, Risto Miikkulainen
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The approach is first validated in a classic synthetic multi-task learning benchmark, and then applied to sharing across disparate architectures for vision, NLP, and genomics tasks. It discovers regularities across these domains, encodes them into sharable modules, and combines these modules systematically to improve performance in the individual tasks. The results confirm that sharing learned functionality across diverse domains and architectures is indeed beneficial, thus establishing a key ingredient for general problem solving in the future. |
| Researcher Affiliation | Collaboration | Elliot Meyerson (Cognizant, elliot.meyerson@cognizant.com) and Risto Miikkulainen (Cognizant and The University of Texas at Austin, risto@cs.utexas.edu) |
| Pseudocode | Yes | Algorithm 1: Decomposed K-valued (1 + λ)-EA (a generic sketch of the underlying (1 + λ)-EA appears after the table) |
| Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing source code for the described methodology, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | The first task is CIFAR-10, the classic image classification benchmark of 60K images [26]. The second task is the WikiText-2 language modeling benchmark with over 2M tokens [36]. The third task is CRISPR binding prediction, where the goal is to predict the propensity of a CRISPR protein complex to bind to (and cut) unintended locations in the genome [21]. |
| Dataset Splits | Yes | Unlike in previous work, five training samples for each task were withheld as validation data, making the setup more difficult (see the split sketch after the table). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper mentions "Pytorch" in the references, but it does not specify a version number for PyTorch or any other software dependencies used in the experiments. |
| Experiment Setup | Yes | All models are trained with Adam [24] for optimization with default parameters and a learning rate of 10⁻⁴. Batch size is 32. Early stopping is used based on validation performance, with a patience of 20 epochs (see the training sketch after the table). |
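The pseudocode identified above is the paper's Algorithm 1, a decomposed K-valued (1 + λ)-EA. The decomposition itself is specific to the paper, but the underlying search procedure can be illustrated generically. Below is a minimal Python sketch of a plain K-valued (1 + λ)-EA, assuming a user-supplied fitness function over fixed-length K-valued genomes; `fitness`, `length`, and the toy usage at the end are hypothetical placeholders, not the paper's actual objective.

```python
import random

def one_plus_lambda_ea(fitness, length, K, lam=4, generations=100):
    """Minimal K-valued (1 + lambda)-EA sketch: a single parent produces
    `lam` mutated offspring each generation, and the best of parent and
    offspring survives (elitist replacement)."""
    p_mut = 1.0 / length  # standard per-gene mutation probability
    parent = [random.randrange(K) for _ in range(length)]
    parent_fit = fitness(parent)
    for _ in range(generations):
        # Each offspring resamples every gene independently with prob. p_mut,
        # drawing the new value uniformly from {0, ..., K-1}.
        offspring = [[random.randrange(K) if random.random() < p_mut else g
                      for g in parent] for _ in range(lam)]
        fits = [fitness(child) for child in offspring]
        best_fit = max(fits)
        if best_fit >= parent_fit:  # ties go to the offspring
            parent = offspring[fits.index(best_fit)]
            parent_fit = best_fit
    return parent, parent_fit

# Toy usage: maximize the number of zeros in a length-20, 4-valued genome.
best, score = one_plus_lambda_ea(lambda g: g.count(0), length=20, K=4)
```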
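The dataset-splits quote describes withholding five training samples per task as validation data. A minimal sketch of such a per-task split, assuming each task's data is held as a plain list of samples; the `task_data` layout and the `split_per_task` name are hypothetical, not the paper's code:

```python
import random

def split_per_task(task_data, n_val=5, seed=0):
    """Withhold `n_val` samples from each task's training set as
    validation data; the remaining samples stay in training.
    `task_data` maps task name -> list of samples (hypothetical layout)."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    train, val = {}, {}
    for task, samples in task_data.items():
        shuffled = samples[:]
        rng.shuffle(shuffled)
        val[task] = shuffled[:n_val]
        train[task] = shuffled[n_val:]
    return train, val
```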
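The experiment-setup row fixes the optimizer (Adam with default parameters), learning rate (10⁻⁴), batch size (32), and early-stopping patience (20 epochs). A hedged PyTorch sketch of that configuration follows; `model`, the datasets, and `loss_fn` are placeholders, and the loop is a generic supervised-training skeleton rather than the paper's multi-task implementation:

```python
import torch
from torch.utils.data import DataLoader

def train(model, train_set, val_set, loss_fn, max_epochs=1000):
    """Reported setup: Adam (default parameters) with learning rate 1e-4,
    batch size 32, early stopping on validation loss with patience 20.
    `model`, `train_set`, `val_set`, and `loss_fn` are placeholders."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
    val_loader = DataLoader(val_set, batch_size=32)
    best_val, patience, bad_epochs = float("inf"), 20, 0
    for epoch in range(max_epochs):
        model.train()
        for x, y in train_loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
        # Early stopping: track validation loss and stop after `patience`
        # consecutive epochs without improvement.
        model.eval()
        with torch.no_grad():
            val_loss = sum(loss_fn(model(x), y).item() for x, y in val_loader)
        if val_loss < best_val:
            best_val, bad_epochs = val_loss, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break
    return model
```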