Bayesian Context Aggregation for Neural Processes
Authors: Michael Volpp, Fabian Flürenbrock, Lukas Grossberger, Christian Daniel, Gerhard Neumann
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present experiments to compare the performances of BA and of MA in NP-based models. |
| Researcher Affiliation | Collaboration | ¹Bosch Center for Artificial Intelligence, Renningen, Germany; ²Karlsruhe Institute of Technology, Karlsruhe, Germany; ³University of Tübingen, Tübingen, Germany |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. It provides architectural diagrams but not code-like descriptions of procedures. |
| Open Source Code | Yes | We publish source code to reproduce the experimental results online: https://github.com/boschresearch/bayesian-context-aggregation |
| Open Datasets | Yes | We use the MNIST database of 28×28 images of handwritten digits (LeCun and Cortes, 2010) |
| Dataset Splits | No | The paper does not provide specific training/validation/test dataset splits in terms of percentages or absolute counts for the generated datasets. It describes how context and target sets are sampled dynamically for each task during training, but not a fixed reproducible split of a master dataset. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions 'Adam optimizer' and 'Optuna framework' but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | we consistently optimize the encoder and decoder network architectures, the latent-space dimensionality d_z, as well as the learning rate of the Adam optimizer (Kingma and Ba, 2015), independently for all model architectures and for all experiments using the Optuna (Akiba et al., 2019) framework, cf. App. 7.5.3. If not stated differently, we report performance in terms of the mean posterior predictive log-likelihood over 256 test tasks with 256 data points each, conditioned on context sets containing N ∈ {0, 1, ..., N_max} data points (cf. App. 7.5.4). For sampling-based methods (VI, MC, ANP), we report the joint log-likelihood over the test sets using a Monte-Carlo approximation with 25 latent samples, cf. App. 7.5.4. |
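
The evaluation protocol quoted above (joint log-likelihood over the test sets via a Monte-Carlo approximation with 25 latent samples) can be illustrated with a minimal sketch. The `decoder` interface, tensor shapes, and Gaussian predictive distribution below are illustrative assumptions, not the authors' released implementation.

```python
import torch

def mc_joint_log_likelihood(decoder, z_samples, x_target, y_target):
    """Monte-Carlo estimate of the joint predictive log-likelihood
    log p(y_target | x_target, context) using S latent samples.

    Assumed (hypothetical) interface:
      decoder(z, x) -> (mean, std), each of shape (n_target, d_y)
      z_samples: tensor of shape (S, d_z), drawn from the latent posterior
      x_target:  (n_target, d_x), y_target: (n_target, d_y)
    """
    per_sample = []
    for z in z_samples:  # S = 25 in the evaluation protocol quoted above
        mean, std = decoder(z, x_target)
        dist = torch.distributions.Normal(mean, std)
        # joint log-likelihood of the whole target set under this latent sample
        per_sample.append(dist.log_prob(y_target).sum())
    per_sample = torch.stack(per_sample)
    # log (1/S) * sum_s exp(log p_s), computed stably with logsumexp
    return torch.logsumexp(per_sample, dim=0) - torch.log(
        torch.tensor(float(len(z_samples)))
    )
```

The logsumexp over per-sample joint log-likelihoods estimates log E_z[p(y | x, z)] rather than the average of per-sample log-likelihoods, which is the usual way to report a joint predictive likelihood for latent-variable NP models.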