Memory Efficient Neural Processes via Constant Memory Attention Block
Authors: Leo Feng, Frederick Tung, Hossein Hajimirsadeghi, Yoshua Bengio, Mohamed Osama Ahmed
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we show CMANPs achieve state-of-the-art results on popular NP benchmarks while being significantly more memory efficient than prior methods. |
| Researcher Affiliation | Collaboration | 1Mila – Université de Montréal, Canada; 2Borealis AI, Canada. |
| Pseudocode | No | No explicit pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | Code: https://github.com/BorealisAI/constant-memory-anp. |
| Open Datasets | Yes | EMNIST (Cohen et al., 2017) comprises black and white images of handwritten letters at 32×32 resolution. |
| Dataset Splits | No | For each task, a random subset of pixels is selected as context data points and another as target data points, where N is a fixed number of context data points and M is a fixed number of target data points. The model is adapted using the context dataset; the target dataset is then used to evaluate the effectiveness of the adaptation and adjust the adaptation rule accordingly. While the paper describes how context and target data points are sampled for each task, it does not provide explicit train/validation/test splits for the overall datasets used (see the sampling sketch after this table). |
| Hardware Specification | Yes | All experiments were run on a Nvidia GTX 1080 Ti (12 GB) or Nvidia Tesla P100 (16 GB) GPU. |
| Software Dependencies | No | The paper mentions using implementations from the official repositories of TNPs and LBANPs and a Cholesky decomposition, but does not provide specific version numbers for any software dependencies such as Python, PyTorch, or other libraries. |
| Experiment Setup | Yes | For consistency, we set the number of latents (i.e., bottleneck size) \|L_I\| = \|L_B\| = 128 across all experiments. We also set b_Q = 5. ... We used an ADAM optimizer with a standard learning rate of 5e-4. We performed a grid search over the weight decay term {0.0, 0.00001, 0.0001, 0.001}. ... The block size for CMANP-AND is set as b_Q = 5. During training, CelebA (128×128), (64×64), and (32×32) used mini-batch sizes of 25, 50, and 100 respectively. (A hedged configuration sketch follows the table.) |
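The per-task context/target sampling described under "Dataset Splits" can be illustrated with a short sketch. The snippet below is not taken from the authors' repository; it assumes PyTorch and a single-image completion task with fixed context size N and fixed target size M, as quoted above.

```python
# Minimal sketch (assumption: PyTorch) of sampling N context and M target
# pixels from one image for a neural-process image-completion task.
import torch

def sample_task(image: torch.Tensor, num_context: int, num_target: int):
    """image: (C, H, W). Returns (x, y) pairs for the context and target sets."""
    c, h, w = image.shape
    coords = torch.stack(
        torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij"), dim=-1
    ).reshape(-1, 2)                                   # pixel coordinates, (H*W, 2)
    values = image.permute(1, 2, 0).reshape(-1, c)     # pixel intensities, (H*W, C)
    perm = torch.randperm(h * w)
    ctx_idx = perm[:num_context]                            # N context pixels
    tgt_idx = perm[num_context:num_context + num_target]    # M target pixels
    x_ctx, y_ctx = coords[ctx_idx].float(), values[ctx_idx]
    x_tgt, y_tgt = coords[tgt_idx].float(), values[tgt_idx]
    return (x_ctx, y_ctx), (x_tgt, y_tgt)
```

The model would be conditioned on the (x_ctx, y_ctx) pairs and evaluated on its predictions at x_tgt against y_tgt; the overall train/validation/test partitioning of the image datasets is the part the paper leaves unspecified.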
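The optimizer settings under "Experiment Setup" (ADAM, learning rate 5e-4, weight-decay grid {0.0, 1e-5, 1e-4, 1e-3}) can likewise be sketched. Only the learning rate and weight-decay grid below come from the paper; `build_model` and `validate` are hypothetical stand-ins, not the authors' CMANP implementation.

```python
# Hedged sketch of the reported optimizer configuration and weight-decay
# grid search, assuming PyTorch. The model and validation routine are
# placeholders, NOT the CMANP architecture from the authors' repository.
import torch
import torch.nn as nn

WEIGHT_DECAYS = [0.0, 1e-5, 1e-4, 1e-3]   # grid reported in the paper
LEARNING_RATE = 5e-4                       # learning rate reported in the paper

def build_model() -> nn.Module:
    return nn.Linear(2, 1)                 # placeholder for the CMANP model

def validate(model: nn.Module) -> float:
    # Placeholder validation: mean-squared error on random data.
    x, y = torch.randn(32, 2), torch.randn(32, 1)
    with torch.no_grad():
        return nn.functional.mse_loss(model(x), y).item()

best_loss, best_wd = float("inf"), None
for wd in WEIGHT_DECAYS:
    model = build_model()
    optimizer = torch.optim.Adam(model.parameters(),
                                 lr=LEARNING_RATE, weight_decay=wd)
    # ... the per-task training loop would go here ...
    val_loss = validate(model)
    if val_loss < best_loss:
        best_loss, best_wd = val_loss, wd
print(f"selected weight decay: {best_wd}")
```

The latent sizes (\|L_I\| = \|L_B\| = 128), block size (b_Q = 5), and per-resolution mini-batch sizes quoted above would be passed to the real model constructor and data loaders in the authors' code.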