Is a Modular Architecture Enough?
Authors: Sarthak Mittal, Yoshua Bengio, Guillaume Lajoie
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we provide a thorough assessment of common modular architectures, through the lens of simple and known modular data distributions. We highlight the benefits of modularity and sparsity and reveal insights on the challenges faced while optimizing modular systems. In doing so, we propose evaluation metrics that highlight the benefits of modularity, the regimes in which these benefits are substantial, as well as the sub-optimality of current end-to-end learned modular systems as opposed to their claimed potential. |
| Researcher Affiliation | Academia | Sarthak Mittal, Yoshua Bengio, Guillaume Lajoie (Mila, Université de Montréal) |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Open-sourced implementation is available at https://github.com/sarthmit/Mod_Arch |
| Open Datasets | No | The paper uses synthetic data ('Since we aim to study modular systems through synthetic data, here we flesh out the data-generating processes...') generated on the fly ('infinite-data regime where each training iteration operates on a new data sample'), so no public dataset or access information is provided; an illustrative sketch of such a setup follows the table. |
| Dataset Splits | No | The paper states it operates in an 'infinite-data regime where each training iteration operates on a new data sample', meaning there are no fixed train/validation/test splits of a finite dataset provided for reproduction. |
| Hardware Specification | Yes | 'All models are trained on single V100 GPUs, each taking a few hours.' |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | No | The paper mentions the scope of experiments (number of rules, model capacities, training settings) and refers to appendices for 'training details', but does not include specific hyperparameter values or detailed training configurations in the main text. |
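
Because the paper samples synthetic, rule-based data on the fly rather than splitting a fixed dataset, a reproduction mainly needs a data-generating process and a loop that draws a fresh batch every training step. The PyTorch sketch below illustrates one plausible shape of that setup; the rule family (fixed per-rule linear maps), dimensions, batch size, and the small MLP learner are illustrative assumptions, not the authors' exact configuration (see https://github.com/sarthmit/Mod_Arch for the released code).

```python
import torch
import torch.nn as nn

# Illustrative sketch, NOT the authors' code: a "modular" data distribution
# where each sample is governed by one of NUM_RULES underlying rules, trained
# in an infinite-data regime (a brand-new batch at every iteration).
torch.manual_seed(0)
NUM_RULES, DIM = 4, 2
RULE_WEIGHTS = torch.randn(NUM_RULES, DIM)  # one fixed linear "rule" per index

def sample_batch(batch_size: int):
    """Draw a fresh batch: inputs x, a one-hot rule context c, targets y."""
    x = torch.randn(batch_size, DIM)
    rule = torch.randint(0, NUM_RULES, (batch_size,))
    c = nn.functional.one_hot(rule, NUM_RULES).float()
    # Target is produced by the rule selected for each sample.
    y = (x * RULE_WEIGHTS[rule]).sum(dim=1, keepdim=True)
    return torch.cat([x, c], dim=1), y

# A placeholder monolithic learner; the paper compares such baselines
# against modular architectures on exactly this kind of distribution.
model = nn.Sequential(nn.Linear(DIM + NUM_RULES, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(10_000):  # "infinite-data" regime: fresh sample every step
    inputs, targets = sample_batch(256)
    loss = nn.functional.mse_loss(model(inputs), targets)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Note how this regime explains the Dataset Splits finding above: since every call to `sample_batch` draws new data, there is no finite dataset to partition, and evaluation can simply use additional fresh batches.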