Secure Out-of-Distribution Task Generalization with Energy-Based Models

Authors: Shengzhuang Chen, Long-Kai Huang, Jonathan Richard Schwarz, Yilun Du, Ying Wei

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on four regression and classification datasets demonstrate the effectiveness of our proposal. In the experiments, we test EBML on both few-shot regression and image classification tasks in search of answers to the following key questions. RQ1: whether the improved expressiveness of EBML over traditional Bayesian meta-learning methods leads to a more accurate model of the meta-training ID task distribution, and hence a more reliable OOD task detector. RQ2: whether the Energy Sum can serve as an effective score for detecting OOD meta-testing tasks. RQ3: whether EBML instantiated with SOTA algorithms can exploit the meta-learned EBM prior during OOD task adaptation to achieve better prediction performance on OOD tasks.
Researcher Affiliation | Collaboration | 1 City University of Hong Kong, 2 Tencent AI Lab, 3 University College London, 4 Massachusetts Institute of Technology, 5 Nanyang Technological University
Pseudocode | Yes | The complete pseudocode for meta-training of EBML is available in Appendix E; pseudocode for the EBML adaptation and inference algorithms described above can also be found in Appendix E.
Open Source Code | No | The paper does not provide an explicit statement or link for an open-source release of the code implementing its methodology.
Open Datasets | Yes | The paper uses the lbap-general-ic50-size ID/OOD task split from the Drug OOD [21] benchmark, and 5-way 1-shot image classification tasks drawn from Meta-dataset [49].
Dataset Splits | No | The paper specifies task domains for meta-training and meta-testing (e.g., 222/145/23 domains by molecular size for ID Train / ID Test / OOD Test on Drug OOD) and task structures (e.g., each training task consists of 2 to 5 support points and 10 query points), but it does not explicitly describe a separate validation split.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, memory, or cloud instance types used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies or version numbers (e.g., Python, PyTorch, or TensorFlow versions) that would be needed to replicate the experiments.
Experiment Setup | Yes | For more experimental details, hyper-parameter configurations, and additional experimental results, please refer to Appendix B and C. (Appendix B.2, "Hyperparameters and Training Details", provides specific values for learning rates, batch sizes, optimizers, and training steps.)
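To make the Energy Sum score referenced under RQ2 concrete, here is a minimal sketch of how an energy-based model could score a meta-testing task for OOD detection. This is not the authors' implementation: the `EnergyNet` architecture, the concatenated (x, y) input, and the quantile-based threshold calibration are all illustrative assumptions; only the idea of summing per-point energies over a task and thresholding against ID tasks comes from the paper's framing.

```python
import torch
import torch.nn as nn

class EnergyNet(nn.Module):
    """Hypothetical energy function E_theta(x, y): lower energy
    should mean the (input, target) pair looks in-distribution."""
    def __init__(self, in_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, 1)
        )

    def forward(self, xy: torch.Tensor) -> torch.Tensor:
        # xy: (num_points, in_dim) -> per-point energies (num_points,)
        return self.net(xy).squeeze(-1)

def energy_sum(model: EnergyNet, x: torch.Tensor, y: torch.Tensor) -> float:
    """Energy Sum for one task: sum of per-point energies over all
    of the task's (support + query) points."""
    with torch.no_grad():
        return model(torch.cat([x, y], dim=-1)).sum().item()

def is_ood(score: float, threshold: float) -> bool:
    # Flag the task as OOD when its Energy Sum exceeds a threshold
    # calibrated on ID meta-training tasks (e.g. a high quantile
    # of their Energy Sum scores).
    return score > threshold
```

In practice the threshold would be set from ID tasks alone, for example as the 95th percentile of their Energy Sum scores, so that roughly 5% of ID tasks are falsely flagged.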