Transformer Neural Processes: Uncertainty-Aware Meta Learning Via Sequence Modeling
Authors: Tung Nguyen, Aditya Grover
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we show that TNPs achieve state-of-the-art performance on various benchmark problems, outperforming all previous NP variants on meta regression, image completion, contextual multi-armed bandits, and Bayesian optimization. |
| Researcher Affiliation | Academia | Tung Nguyen¹, Aditya Grover¹. ¹Department of Computer Science, UCLA. Correspondence to: Tung Nguyen <tungnd@cs.ucla.edu>. |
| Pseudocode | No | The paper describes procedures but does not include a figure, block, or section labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | We have open-sourced the codebase for reproducing our experiments [1]. The implementation of the baselines is borrowed from the official implementation of BNPs [2]. [1] https://github.com/tung-nd/TNP-pytorch |
| Open Datasets | Yes | We use two datasets for this experiment: EMNIST (Cohen et al., 2017) and CelebA (Liu et al., 2018)... We compare TNPs with the baselines on the wheel bandit problem introduced in Riquelme et al. (2018), and we use various benchmark functions in the optimization literature (Kim & Choi, 2017; Kim, 2020). |
| Dataset Splits | No | No explicit mention of separate validation splits. The paper instead describes a dynamic split into context and target points: 'For each fi, we choose N random locations to evaluate, and sample an index m that splits the sequence into context and target points. For all methods, ℓ ∼ U[0.6, 1.0), σ_f ∼ U[0.1, 1.0), B = 16, N ∼ U[6, 50), m ∼ U[3, 47).' (A minimal sketch of this sampling scheme appears after the table.) |
| Hardware Specification | Yes | We measure run time on the 1-D regression task on an RTX2080Ti, with 1000 batches of size 16. |
| Software Dependencies | No | The paper mentions open-sourced codebase but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | In this section we present the hyperparameters that we used to train TNPs in the experiments. ... Model dimension: 64, Number of embedding layers: 4, Feed-forward dimension: 128, Number of attention heads: 4, Number of transformer layers: 6, Dropout: 0.0, Number of training steps: 100000, Learning rate: 5e-4 with cosine annealing scheduler. (A sketch of this training configuration appears after the table.) |
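
The context/target split quoted in the "Dataset Splits" row can be illustrated with a short sketch. Only the sampling distributions (ℓ ∼ U[0.6, 1.0), σ_f ∼ U[0.1, 1.0), B = 16, N ∼ U[6, 50), m ∼ U[3, 47)) come from the quoted text; the RBF kernel form, the input range [-2, 2], and the per-function resampling of ℓ and σ_f are assumptions made for illustration, not the authors' exact data pipeline.

```python
# Minimal sketch (not the authors' code) of the dynamic context/target split
# quoted above, assuming an RBF-kernel GP prior over 1-D functions.
import numpy as np

rng = np.random.default_rng(0)

def rbf_kernel(x, lengthscale, sigma_f):
    """Squared-exponential kernel matrix for 1-D inputs of shape (N, 1)."""
    d2 = (x - x.T) ** 2
    return sigma_f ** 2 * np.exp(-0.5 * d2 / lengthscale ** 2)

def sample_batch(batch_size=16):
    """Sample one meta-learning batch: B functions, each split into context/target."""
    # Quoted settings: N ~ U[6, 50) evaluation points, split index m ~ U[3, 47).
    N = rng.integers(6, 50)
    m = rng.integers(3, min(47, N))  # keep m < N so at least one target remains
    tasks = []
    for _ in range(batch_size):
        lengthscale = rng.uniform(0.6, 1.0)   # l ~ U[0.6, 1.0)
        sigma_f = rng.uniform(0.1, 1.0)       # sigma_f ~ U[0.1, 1.0)
        x = rng.uniform(-2.0, 2.0, size=(N, 1))  # evaluation locations (range assumed)
        K = rbf_kernel(x, lengthscale, sigma_f) + 1e-6 * np.eye(N)
        y = rng.multivariate_normal(np.zeros(N), K)[:, None]
        # First m points form the context set, the remaining N - m are targets.
        tasks.append({"xc": x[:m], "yc": y[:m], "xt": x[m:], "yt": y[m:]})
    return tasks

batch = sample_batch()
print(len(batch), batch[0]["xc"].shape, batch[0]["xt"].shape)
```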
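The hyperparameters quoted in the "Experiment Setup" row map onto a standard PyTorch configuration, sketched below. This is not the authors' model: the optimizer choice (Adam), the use of `nn.TransformerEncoder` as a stand-in for the TNP architecture, and the commented training loop are assumptions; only the dimensions, layer counts, dropout, step count, and learning-rate schedule come from the quoted values.

```python
# Minimal sketch of the quoted training configuration in PyTorch; the actual
# TNP architecture lives in the open-sourced repo and is not reproduced here.
import torch
import torch.nn as nn

D_MODEL = 64           # "Model dimension: 64"
D_FF = 128             # "Feed-forward dimension: 128"
N_HEADS = 4            # "Number of attention heads: 4"
N_LAYERS = 6           # "Number of transformer layers: 6"
DROPOUT = 0.0          # "Dropout: 0.0"
TRAIN_STEPS = 100_000  # "Number of training steps: 100000"
LR = 5e-4              # "Learning rate: 5e-4 with cosine annealing scheduler"

# Stand-in transformer encoder; the paper's model also has embedding layers
# and a predictive head over target points, omitted here.
encoder_layer = nn.TransformerEncoderLayer(
    d_model=D_MODEL, nhead=N_HEADS, dim_feedforward=D_FF,
    dropout=DROPOUT, batch_first=True,
)
model = nn.TransformerEncoder(encoder_layer, num_layers=N_LAYERS)

optimizer = torch.optim.Adam(model.parameters(), lr=LR)  # optimizer choice assumed
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=TRAIN_STEPS)

# Per training step: compute the loss, backpropagate, then step both
# the optimizer and the cosine-annealing scheduler.
# for step in range(TRAIN_STEPS):
#     loss = ...  # e.g., negative predictive log-likelihood over target points
#     optimizer.zero_grad(); loss.backward()
#     optimizer.step(); scheduler.step()
```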