Probabilistic Numeric Convolutional Neural Networks

Authors: Marc Anton Finzi, Roberto Bondesan, Max Welling

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In experiments we show that our approach yields a 3× reduction of error from the previous state of the art on the SuperPixel-MNIST dataset and competitive performance on the medical time series dataset PhysioNet2012.
Researcher Affiliation | Collaboration | Marc Finzi, Qualcomm AI Research & New York University (maf820@nyu.edu); Roberto Bondesan & Max Welling, Qualcomm AI Research ({rbondesa, mwelling}@qti.qualcomm.com)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | Yes | We source the SuperPixel MNIST dataset (Monti et al., 2017) from Fey and Lenssen (2019), consisting of 60k training examples and 10k test examples represented as collections of positions and grayscale values $\{(x_i, f(x_i))\}_{i=1}^{75}$ at the N = 75 superpixel centroids. For the second task, we evaluate our model on the irregularly spaced time series dataset PhysioNet2012 (Silva et al., 2012) for predicting mortality from ICU vital signs. (A loading sketch appears after the table.)
Dataset Splits | Yes | We source the SuperPixel MNIST dataset (Monti et al., 2017) from Fey and Lenssen (2019), consisting of 60k training examples and 10k test examples... We follow the data preprocessing from Horn et al. (2019) and the 10k-2k train/test split. For both datasets we tuned hyperparameters on a validation set of size 10% before folding the validation set back into the training set for the final runs. (A split-protocol sketch appears after the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments. It only mentions that "Both models take about 2 hours to train."
Software Dependencies | No | The paper mentions using the Adam optimizer but does not specify software dependencies with version numbers (e.g., Python, PyTorch, or specific library versions).
Experiment Setup | Yes | For the PNCNN on the SuperPixel MNIST dataset, we use 4 PNCNN convolution blocks with c = 128 channels and K = 9 basis elements for the different drift and diffusion parameters in $\sum_{k=1}^{K} W_k e^{D_k}$. We train for 20 epochs using the Adam optimizer (Kingma and Ba, 2014) with lr = $3 \times 10^{-3}$ and batch size 50. For the PNCNN on the PhysioNet2012 dataset, we use the variant of the PNCNN convolution layer that uses the stochastic diagonal estimator described in appendix G with P = 20 probes. In the convolution blocks we use c = 96 channels and K = 5 basis elements, and we train for 10 epochs using the same optimizer settings as above. (A layer and optimizer sketch appears after the table.)
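
The SuperPixel MNIST source named in the Open Datasets row ships with PyTorch Geometric (Fey and Lenssen, 2019). A minimal loading sketch, assuming `torch_geometric` is installed; the `./data` root path is illustrative:

```python
# Loading SuperPixel MNIST via PyTorch Geometric (Fey and Lenssen, 2019).
# Each example stores the N = 75 superpixel centroid positions in `pos`
# and the grayscale values f(x_i) in `x`. The "./data" root is illustrative.
from torch_geometric.datasets import MNISTSuperpixels

train_set = MNISTSuperpixels(root="./data", train=True)   # 60k training examples
test_set = MNISTSuperpixels(root="./data", train=False)   # 10k test examples

sample = train_set[0]
print(sample.pos.shape)  # centroid positions x_i: torch.Size([75, 2])
print(sample.x.shape)    # grayscale values f(x_i): torch.Size([75, 1])
print(sample.y)          # digit label
```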
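
The Dataset Splits row reports tuning on a 10% validation split that is later folded back into the training data. A sketch of that protocol under stated assumptions (the use of `random_split` and a fixed seed are not specified in the paper):

```python
# Hedged sketch of the validation protocol from the Dataset Splits row:
# hold out 10% for hyperparameter tuning, then retrain on everything.
# random_split and the manual seed are assumptions, not from the paper.
import torch
from torch.utils.data import random_split

full_train = list(range(60_000))   # stand-in for the 60k training examples
val_size = len(full_train) // 10   # 10% validation split
train_subset, val_subset = random_split(
    full_train,
    [len(full_train) - val_size, val_size],
    generator=torch.Generator().manual_seed(0),  # assumed seed
)
# Tune hyperparameters on val_subset, then fold it back in and train the
# final model on full_train.
```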
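
For the Experiment Setup row, a hedged sketch of the quoted layer parameterization $\sum_{k=1}^{K} W_k e^{D_k}$ with the SuperPixel MNIST sizes (N = 75, K = 9, c = 128) and the reported Adam settings. The dense form of the generators $D_k$ and the exact tensor contraction are illustrative assumptions; the paper releases no code:

```python
# Hedged sketch of one sum_{k=1}^K W_k e^{D_k} layer acting on features at
# N = 75 superpixel centroids. Dense D_k matrices and the chosen contraction
# are assumptions made for illustration; only N, K, c, and the optimizer
# settings (Adam, lr = 3e-3) come from the paper.
import torch

N, K, c = 75, 9, 128                 # centroids, basis elements, channels
f = torch.randn(N, c)                # feature values at the centroids
D = 0.01 * torch.randn(K, N, N)      # illustrative generators D_k
W = torch.randn(K, c, c) / c ** 0.5  # per-basis channel mixing W_k

# One reading of sum_k W_k e^{D_k}: propagate features with exp(D_k), then
# mix channels with W_k, summing over the K basis elements.
out = sum(torch.linalg.matrix_exp(D[k]) @ f @ W[k].T for k in range(K))
print(out.shape)  # torch.Size([75, 128])

# Reported optimizer settings: Adam with lr = 3e-3 (batch size 50, 20 epochs).
optimizer = torch.optim.Adam([W.requires_grad_(), D.requires_grad_()], lr=3e-3)
```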