Neural Processes with Stability

Authors: Huafeng Liu, Liping Jing, Jian Yu

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To illustrate the superiority of the proposed model, we perform experiments on both synthetic and real-world data, and the results demonstrate that our approach not only helps to achieve more accurate performance but also improves model robustness.
Researcher Affiliation | Academia | Huafeng Liu, Liping Jing, Jian Yu; Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University; School of Computer and Information Technology, Beijing Jiaotong University; {hfliu1, lpjing, jianyu}@bjtu.edu.cn
Pseudocode | Yes | Algorithm 1: Learning algorithm for stable NPs
Open Source Code | No | The paper does not provide an explicit statement about open-sourcing the code for the described methodology, nor a link to a code repository.
Open Datasets | Yes | We trained the NPs on EMNIST [4] and 32 × 32 CELEBA [23] using the standard train/test split with up to 200 context/target points at training.
Dataset Splits | No | The paper mentions a "standard train/test split" for EMNIST and CELEBA but does not describe a validation split or its details.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments.
Software Dependencies | No | The paper does not name any software dependencies with version numbers; frameworks such as PyTorch are at most implied by common deep-learning practice.
Experiment Setup | Yes | For synthetic 1D regression experiments, the neural architectures for CNP, NP, ANP, BCNP, BNP, BANP, and our SCNP/SNP/SANP are given in Appendix B. The number of hidden units is d_h = 128 and the latent representation size is d_z = 128. The numbers of layers are l_e = l_de = l_la = l_qk = l_v = 2. We trained all models for 100,000 steps, with each step computing updates on a batch of 100 tasks. We used the Adam optimizer with an initial learning rate of 5e-4 and decayed the learning rate with a cosine annealing scheme for the baselines. For SCNP/SNP/SANP, we set K = 3. The size of the context C was drawn as |C| ∼ U(3, 200). Testing was done on 3,000 batches, each containing 16 tasks (48,000 tasks in total).
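Because the Experiment Setup row packs the full training configuration into a single quote, a small sketch may help a reader reconstruct it. This is a minimal sketch only: PyTorch is an assumption (the paper names no framework), and the ToyDeterministicEncoder model, the sine targets, and the squared-norm loss are hypothetical placeholders, not the paper's method. Only the hyperparameter values (d_h = d_z = 128, 2 layers, 100,000 steps of 100 tasks each, Adam at 5e-4 with cosine annealing, |C| ∼ U(3, 200), K = 3) come from the paper's text.

```python
import torch
import torch.nn as nn

# Hyperparameters reported in the Experiment Setup row above.
D_HIDDEN = 128          # d_h: number of hidden units
D_LATENT = 128          # d_z: latent representation size
NUM_LAYERS = 2          # l_e = l_de = l_la = l_qk = l_v = 2
TRAIN_STEPS = 100_000   # total optimization steps
TASKS_PER_BATCH = 100   # tasks per training step
LR = 5e-4               # initial Adam learning rate
K = 3                   # reported for SCNP/SNP/SANP; unused by this placeholder model


class ToyDeterministicEncoder(nn.Module):
    """Hypothetical stand-in for the NP-family models compared in the paper."""

    def __init__(self, d_hidden=D_HIDDEN, d_latent=D_LATENT, num_layers=NUM_LAYERS):
        super().__init__()
        layers, d_in = [], 2  # (x, y) pairs for synthetic 1D regression
        for _ in range(num_layers):
            layers += [nn.Linear(d_in, d_hidden), nn.ReLU()]
            d_in = d_hidden
        layers.append(nn.Linear(d_hidden, d_latent))
        self.net = nn.Sequential(*layers)

    def forward(self, x_ctx, y_ctx):
        # Encode each (x, y) context pair, then mean-pool over the context set.
        return self.net(torch.cat([x_ctx, y_ctx], dim=-1)).mean(dim=1)


model = ToyDeterministicEncoder()
optimizer = torch.optim.Adam(model.parameters(), lr=LR)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=TRAIN_STEPS)

for step in range(TRAIN_STEPS):
    # Context size drawn as |C| ~ U(3, 200), as stated in the setup.
    num_ctx = int(torch.randint(3, 201, (1,)))
    x_ctx = torch.randn(TASKS_PER_BATCH, num_ctx, 1)
    y_ctx = torch.sin(x_ctx)      # placeholder synthetic 1D regression targets
    z = model(x_ctx, y_ctx)
    loss = z.pow(2).mean()        # placeholder loss, not the paper's objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```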