Rethinking Neural Operations for Diverse Tasks

Authors: Nicholas Roberts, Mikhail Khodak, Tri Dao, Liam Li, Christopher Ré, Ameet Talwalkar

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On a diverse set of tasks (solving PDEs, distance prediction for protein folding, and music modeling), our approach consistently yields models with lower error than baseline networks and often even lower error than expert-designed domain-specific approaches.
Researcher Affiliation | Collaboration | Nicholas Roberts, University of Wisconsin-Madison (nick11roberts@cs.wisc.edu); Mikhail Khodak, Carnegie Mellon University (khodak@cmu.edu); Tri Dao, Stanford University (trid@stanford.edu); Liam Li, Hewlett Packard Enterprise (me@liamcli.com); Christopher Ré, Stanford University (chrismre@cs.stanford.edu); Ameet Talwalkar, Carnegie Mellon University & Hewlett Packard Enterprise (talwalkar@cmu.edu)
Pseudocode | No | The paper describes procedures and algorithms in natural language but does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | Code to reproduce these results is available here: https://github.com/nick11roberts/XD. Software to apply XD-operations can be found here: https://github.com/mkhodak/relax.
Open Datasets | Yes | We work with the PDNET benchmark, which consists of a training set of 3,356 proteins, a validation set of 100 proteins, and the PSICOV [18] test set of 150 proteins. ... The tasks we study are on the JSB Chorales and Nottingham corpora, used in the original evaluation of TCNs [5].
Dataset Splits | Yes | We work with the PDNET benchmark, which consists of a training set of 3,356 proteins, a validation set of 100 proteins, and the PSICOV [18] test set of 150 proteins.
Hardware Specification | No | The paper reports 'Cost (hours)' in Table 1 but does not specify the GPU, CPU, or other hardware used for the experiments; it refers readers to an appendix for this information, which is not provided.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., 'PyTorch 1.9' or 'Python 3.8').
Experiment Setup | Yes | We tune step-size, momentum, and the number of warmup epochs: initial epochs during which only model weights w_{u,v} are updated. ... At all dimensions we use XD-operations of depth d = 13; in addition, in dimensions N > 1 we fix the architecture biases b and channel gates C to 0 and 1, respectively, to conserve memory at higher resolutions.
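The warmup scheme quoted under Experiment Setup (initial epochs during which only the model weights w_{u,v} are updated, before the XD architecture parameters such as the biases b and channel gates C are also trained) can be sketched as a minimal per-epoch schedule. This is an illustrative sketch only; the function and flag names below are hypothetical and do not come from the paper's released code.

```python
def xd_update_schedule(num_epochs, warmup_epochs):
    """Illustrative training schedule for the warmup scheme described above.

    During the first `warmup_epochs` epochs only the model weights w_{u,v}
    are updated; afterwards the XD-operation architecture parameters
    (e.g. the biases b and channel gates C) are updated as well.
    """
    schedule = []
    for epoch in range(num_epochs):
        schedule.append({
            "epoch": epoch,
            "update_model_weights": True,                  # trained every epoch
            "update_arch_params": epoch >= warmup_epochs,  # frozen during warmup
        })
    return schedule
```

In a real training loop, such a schedule would typically gate which optimizer parameter groups receive updates each epoch, with step-size and momentum tuned separately as the quote describes.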