Sequence Modeling with Multiresolution Convolutional Memory
Authors: Jiaxin Shi, Ke Alexander Wang, Emily Fox
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical evaluation covers sequential image classification and autoregressive generative modeling (CIFAR-10), reasoning on syntax trees (ListOps), and multi-label classification of electrocardiograms (PTB-XL). |
| Researcher Affiliation | Academia | Jiaxin Shi¹, Ke Alexander Wang¹, Emily B. Fox¹,²; ¹Stanford University, ²CZ Biohub SF. |
| Pseudocode | Yes | We provide an example PyTorch implementation of a MultiresLayer with the resolution fading Tree Select strategy (see Sec. 3.2) in Fig. 3. (A hedged sketch of such a layer is given after this table.) |
| Open Source Code | Yes | PyTorch code can be found at https://github.com/thjashin/multires-conv. |
| Open Datasets | Yes | We evaluate our model on the Sequential CIFAR-10 dataset, which has long been used as a standard benchmark for modeling long-range dependencies in RNNs. [...] long ListOps dataset from Tay et al. (2021). [...] PTB-XL (Wagner et al., 2020) is a publicly available dataset of electrocardiogram (ECG) time series. |
| Dataset Splits | Yes | We use the standard train and test split of the CIFAR-10 dataset and leave out 10% of the training set as the validation set. |
| Hardware Specification | No | The paper mentions 'hardware accelerators implementing convolutions' but does not specify any particular GPU models, CPU models, or other hardware details used for the experiments. |
| Software Dependencies | No | The paper refers to 'PyTorch code' but does not specify exact version numbers for PyTorch, Python, CUDA, or other relevant libraries. |
| Experiment Setup | Yes | We use the Adam optimizer with default hyperparameters and decoupled weight decay (Loshchilov & Hutter, 2018). Dropout is applied after the GELU and gated linear activation functions whenever overfitting is observed. (A hedged sketch of this setup is given after this table.) |
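
For readers without access to the paper's Fig. 3, below is a minimal PyTorch sketch of a multiresolution (wavelet-style) causal convolution layer in the spirit of MultiresLayer. This is not the authors' implementation: the filter names `h0`/`h1`, the `depth` parameter, and the linear mixing across resolutions are assumptions made for illustration; the reference implementation lives at https://github.com/thjashin/multires-conv.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiresConvSketch(nn.Module):
    """Illustrative multiresolution causal convolution layer (not the paper's Fig. 3 code)."""

    def __init__(self, d_model: int, kernel_size: int = 2, depth: int = 4):
        super().__init__()
        self.depth = depth
        self.kernel_size = kernel_size
        # Depthwise low-pass (h0) and high-pass (h1) filters shared across scales.
        self.h0 = nn.Parameter(torch.randn(d_model, 1, kernel_size) / kernel_size**0.5)
        self.h1 = nn.Parameter(torch.randn(d_model, 1, kernel_size) / kernel_size**0.5)
        # Pointwise mixing of the (depth + 1) coefficient streams back to d_model channels.
        self.mix = nn.Linear((depth + 1) * d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, d_model, length)
        batch, d_model, length = x.shape
        coeffs = []
        approx = x
        for j in range(self.depth):
            dilation = 2 ** j
            # Left-pad so the convolution is causal: no future positions leak in.
            pad = dilation * (self.kernel_size - 1)
            padded = F.pad(approx, (pad, 0))
            detail = F.conv1d(padded, self.h1, groups=d_model, dilation=dilation)
            approx = F.conv1d(padded, self.h0, groups=d_model, dilation=dilation)
            coeffs.append(detail)
        coeffs.append(approx)  # coarsest approximation acts as the longest-range memory
        # Concatenate resolutions along the channel axis and mix them pointwise.
        y = torch.cat(coeffs, dim=1).transpose(1, 2)   # (batch, length, (depth+1)*d_model)
        return self.mix(y).transpose(1, 2)             # (batch, d_model, length)


# Example usage on a batch of 8 sequences of length 1024 with 64 channels.
layer = MultiresConvSketch(d_model=64)
out = layer(torch.randn(8, 64, 1024))
print(out.shape)  # torch.Size([8, 64, 1024])
```

Dilating the filters by 2^j at scale j mirrors an à trous wavelet transform, which is how a small fixed filter can summarize exponentially long context.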
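
The experiment-setup description (Adam with default hyperparameters, decoupled weight decay, dropout after the GELU and gated linear activations) maps onto a PyTorch configuration like the sketch below. The layer sizes, learning rate, weight-decay coefficient, and dropout probability are placeholders rather than values taken from the paper.

```python
import torch
import torch.nn as nn

# Placeholder block: dropout follows the GELU and the gated linear unit (GLU),
# matching the description; 128/256/10 and p=0.1 are illustrative values only.
model = nn.Sequential(
    nn.Linear(128, 256),
    nn.GELU(),
    nn.Dropout(p=0.1),
    nn.Linear(256, 256),
    nn.GLU(dim=-1),      # gated linear activation; halves 256 features to 128
    nn.Dropout(p=0.1),
    nn.Linear(128, 10),
)

# AdamW = Adam with decoupled weight decay (Loshchilov & Hutter); default Adam
# hyperparameters, with the weight_decay value chosen here only as an example.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)
```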